(diagnostic method practised on) the human/animal body, the search
has been carried out and based on the alleged effects of the 
2. Claims Nos.:
claims:

1. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 1-8 and 52-59 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype I/1a.

2. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 103-108 and
155-160 used in the recombinant protein expression, detection
of antibodies against HCV, vaccines and methods of detection
using PCR primers and Identification of Genotype I/1a.

3. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 9-25 and 60-76 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype II/1b.

4. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 109-124 and 161-176
used in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype II/1b.

5. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 26-29 and 77-80 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype III/2a.

6. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 125-128 and 177-180
used in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR
primers and Identification of Genotype III/2a.

7. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 30-33 and 81-84 used in
the recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype IV/2b.

8. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 129-133 and 181-185
used in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype IV/2b.

9. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 34 and 85 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype IV/2c.

10. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 134 and 186 used in
the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype IV/2c.

11. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 35-39 and 86-90 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype V/3a.

12. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 135-138 and 187-190
used in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR
primers and Identification of Genotype V/3a.

13. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 40 and 91 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4a.

14. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 139 and 191 used in
the recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4a.

15. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 41 and 92 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4b.

16. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 141 and 193 used in
the recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4b.

17. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 42-43 and 93-94 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype 4c.

18. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 143-144 and 195-196
used in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR
primers and Identification of Genotype 4c.

19. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 44 and 95 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4d.

20. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 145 and 197 used in
the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype 4d.

21. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 142 and 194 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4e.

22. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 140 and 192 used in
the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype 4f.

23. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 45-50 and 96-101 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype 5a.

24. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 146-153 and 198-205
used in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR
primers and Identification of Genotype 5a.

25. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 51 and 102 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 6a.

26. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 154 and 206 used in
the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype 6a.

Title of the Invention



NUCLEOTIDE AND AMINO ACID SEQUENCES 
OF THE ENVELOPE 1 AND CORE GENES 
OF HEPATITIS C VIRUS 



The present application is a continuation-in-part 
of pending U.S. Application Serial No. 08/086/428, filed on 
June 29, 1993. 

Field Of Invention

The present invention is in the field of
hepatitis virology. The invention relates to the complete
nucleotide and deduced amino acid sequences of the envelope
1 (E1) and core genes of hepatitis C virus (HCV) isolates
from around the world and the grouping of these isolates
into fourteen distinct HCV genotypes. More specifically,
this invention relates to oligonucleotides, peptides and
recombinant proteins derived from the envelope 1 and core
gene sequences of these isolates of hepatitis C virus and
to diagnostic methods and vaccines which employ these
reagents.

Background Of Invention

Hepatitis C, originally called non-A, non-B
hepatitis, was first described in 1975 as a disease
serologically distinct from hepatitis A and hepatitis B
(Feinstone, S.M. et al. (1975) N. Engl. J. Med. 292:767-770).
Although hepatitis C was (and is) the leading type
of transfusion-associated hepatitis as well as an important
part of community-acquired hepatitis, little progress was
made in understanding the disease until the recent
identification of hepatitis C virus (HCV) as the causative
agent of hepatitis C via the cloning and sequencing of the
HCV genome (Choo, Q.-L. et al. (1989) Science 244:359-362).
The sequence information generated by this study resulted
in the characterization of HCV as a small, enveloped,
positive-stranded RNA virus and led to the demonstration
that HCV is a major cause of both acute and chronic
hepatitis worldwide (Weiner, A.J. et al. (1990) Lancet
335:1-3). These observations, combined with studies
showing that over 50% of acute cases of hepatitis C
progress to chronicity with 20% of these resulting in
cirrhosis and an undetermined proportion progressing to
liver cancer, have led to tremendous efforts by
investigators within the hepatitis C field to develop
diagnostic assays and vaccines which can detect and prevent
hepatitis C infection.

The cloning and sequencing of the HCV genome by
Choo et al. (1989) has permitted the development of
serologic tests which can detect HCV or antibody to HCV
(Kuo, G. et al. (1989) Science 244:362-364). In addition,
the work of Choo et al. has also allowed the development of
methods for detecting HCV infection via amplification of
HCV RNA sequences by reverse transcription and cDNA
polymerase chain reaction (RT-PCR) using primers derived
from the HCV genomic sequence (Weiner, A.J. et al.).
However, although the development of these diagnostic
methods has resulted in improved diagnosis of HCV
infection, only approximately 60% of cases of hepatitis C
are associated with a factor identified as contributing to
transmission of HCV (Alter, M.J. et al. (1989) JAMA
262:1201-1205). This observation suggests that effective
control of hepatitis C transmission is likely to occur only
via universal pediatric vaccination as has been initiated
recently for hepatitis B virus. Unfortunately, attempts to
date to protect chimpanzees from hepatitis C infection via
administration of recombinant vaccines have had only
limited success. Moreover, the apparent genetic
heterogeneity of HCV, as indicated by the recent assignment
of all available HCV isolates to one of four genotypes, I-IV
(Okamoto, H. et al. (1992) J. Gen. Virol. 73:673-679),
presents additional hurdles which must be overcome in order
to develop accurate and effective diagnostic assays and
vaccines.

For example, one possible obstacle to the
development of effective hepatitis C vaccines would arise
if the observed genetic heterogeneity of HCV reflects
serologic heterogeneity. In such a case, the most
genetically diverse strains of HCV may then represent
different serotypes of HCV with the result being that
infection with one strain may not protect against infection
with another. Indeed, the inability of one strain to
protect against infection with another strain was recently
noted by both Farci et al. (Farci, P. et al. (1992) Science
258:135-140) and Prince et al. (Prince, A.M. et al. (1992)
J. Infect. Dis. 165:438-443), each of whom presented
evidence that while infection with one strain of HCV does
modify the degree of the hepatitis C associated with the
reinfection, it does not protect against reinfection with a
closely related strain. The genetic heterogeneity among
different HCV strains also increases the difficulty
encountered in developing RT-PCR assays to detect HCV
infection since such heterogeneity often results in
false-negative results because of primer and template mismatch.
In addition, currently used serologic tests for detection
of HCV or for detection of antibody to HCV are not
sufficiently well developed to detect all of the HCV
genotypes which might exist in a given blood sample.
Finally, in terms of choosing the proper treatment modality
to combat hepatitis infection, the inability of presently
available serologic assays to distinguish among the various
genotypes of HCV represents a significant shortcoming in
that recent reports suggest that an HCV-infected patient's
response to therapy might be related to the genotype of the
infectious virus (Yoshioka, K. et al. (1992) Hepatology
16:293-299; Kanai, K. et al. (1992) Lancet 339:1543; Lau,
J.Y.N. et al. (1992) Hepatology 16:209A). Indeed, the data
presented in the above studies suggest that the closely
related genotypes I and II are less responsive to
interferon therapy than are the closely related genotypes
III and IV. Moreover, preliminary data by Pozzato et al.
(Pozzato, G. et al. (1991) Lancet 338:509) suggest that
different genotypes may be associated with different types
or degrees of clinical disease. Taken together, these
studies suggest that before effective vaccines against HCV
infection can be developed, and indeed, before more
accurate and effective methods for diagnosis and treatment
of HCV infection can be produced, one must obtain a greater
knowledge about the genetic and serologic diversity of HCV
isolates.
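The primer and template mismatch problem noted above can be illustrated with a minimal sketch. The function names, the example sequences and the mismatch threshold are hypothetical and are not taken from the application; for simplicity the primer and the template site are written in the same sense and are assumed to be already aligned and of equal length.

```python
def count_mismatches(primer: str, template_site: str) -> int:
    """Count positions at which a primer differs from an aligned template site."""
    if len(primer) != len(template_site):
        raise ValueError("primer and template site must have equal length")
    return sum(1 for p, t in zip(primer.upper(), template_site.upper()) if p != t)


def likely_to_anneal(primer: str, template_site: str, max_mismatches: int = 2) -> bool:
    """Crude screen: accept the primer if it has few mismatches.

    The threshold is an illustrative assumption, not an experimental value.
    """
    return count_mismatches(primer, template_site) <= max_mismatches


# Hypothetical sequences: the same primer against sites from two isolates.
primer = "ACGTACGTAC"
site_isolate_a = "ACGTACGTAC"   # identical site
site_isolate_b = "ACCTATGAAG"   # divergent site

print(count_mismatches(primer, site_isolate_a), likely_to_anneal(primer, site_isolate_a))
print(count_mismatches(primer, site_isolate_b), likely_to_anneal(primer, site_isolate_b))
```

A screen of this kind only counts differences; it does not model mismatch position or annealing temperature, both of which affect amplification in practice.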

In a recent attempt to gain an understanding of
the extent of genetic heterogeneity among HCV strains, Bukh
et al. carried out a detailed analysis of HCV isolates via
the use of PCR technology to amplify different regions of
the HCV genome (Bukh, J. et al. (1992a) Proc. Natl. Acad.
Sci. 89:187-191). Following PCR amplification, the
5'-noncoding (5' NC) portion of the genomes of various HCV
isolates was sequenced and it was found that primer pairs
designed from conserved regions of the 5' NC region of the
HCV genome were more sensitive for detecting the presence
of HCV than were primer pairs representing other portions
of the genome (Bukh, J. et al. (1992b) Proc. Natl. Acad.
Sci. U.S.A. 89:4942-4946). In addition, the authors noted
that although many of the HCV isolates examined could be
classified into the four genotypes described by Okamoto et
al. (1992), other previously undescribed genotypes emerged
based on genetic heterogeneity observed in the 5' NC region
of the various isolates. One of the most prominent of
these newly noted genotypes comprised a group of related
viruses that contained the most genetically divergent 5' NC
regions of those studied. This group of viruses,
tentatively classified as a fifth genotype, is very
similar to strains recently described by others (Cha, T.-A.



SUBSTITUTE SHEET (RULE 26) 



WO 96/05315 



PCT/US95/10398



et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:7144-7148;
Chan, S-W. et al. (1992) J. Gen. Virol. 73:1131-1141 and
Lee, C-H et al. (1992) J. Clin. Microbiol. 30:1602-1604).
In addition, at least four more putative genotypes were
identified thereby providing evidence that the genetic
heterogeneity of HCV was more extensive than previously
appreciated.

However, while the studies of Bukh et al. (1992a
and b) provided new and useful information on the genetic
heterogeneity of HCV, it is widely appreciated by those
skilled in the art that the three structural genes of HCV,
core (C), envelope (E1) and envelope 2/nonstructural 1
(E2/NS1), are the most important for the development of
serologic diagnostics and vaccines since it is the products
of these genes that constitute the hepatitis C virion.
Thus, a determination of the nucleotide sequence of one or
all of the structural genes of a variety of HCV isolates
would be useful in designing reagents for use in diagnostic
assays and vaccines since a demonstration of genetic
heterogeneity in a structural gene(s) of HCV isolates might
suggest that some of the HCV genotypes represent distinct
serotypes of HCV based upon the previously observed
relationship between genetic heterogeneity and serologic
heterogeneity among another group of single-stranded,
positive-sense RNA viruses, the picornaviruses (Rueckert,
R.R. "Picornaviridae and their replication", in Fields,
B.N. et al., eds. Virology, New York: Raven Press, Ltd.
(1990) 507-548).

Summary of Invention

The present invention relates to cDNAs encoding
the complete nucleotide sequence of either the envelope 1
(E1) gene or the core (C) gene of an isolate of human
hepatitis C virus (HCV).

The present invention also relates to the nucleic
acid and deduced amino acid sequences of these E1 and core
cDNAs.

It is an object of this invention to provide
synthetic nucleic acid sequences capable of directing
production of recombinant E1 and core proteins, as well as
equivalent natural nucleic acid sequences. Such natural
nucleic acid sequences may be isolated from a cDNA or
genomic library from which the gene capable of directing
synthesis of the E1 or core proteins may be identified and
isolated. For purposes of this application, nucleic acid
sequence refers to RNA, DNA, cDNA or any synthetic variant
thereof which encodes for peptides.

The invention also relates to the method of
preparing recombinant E1 and core proteins derived from E1
and core cDNA sequences respectively by cloning the nucleic
acid encoding either the recombinant E1 or core protein,
inserting the cDNA into an expression vector and expressing
the recombinant protein in a host cell.

The invention also relates to isolated and
substantially purified recombinant E1 and core proteins and
analogs thereof encoded by E1 and core cDNAs respectively.

The invention further relates to the use of
recombinant E1 and core proteins, either alone, or in
combination with each other, as diagnostic agents and as
vaccines.

The present invention also relates to the
recombinant production of the core protein of the present
invention to contain a second protein on its surface and
therefore serve as a carrier in a multivalent vaccine
preparation. Further, the present invention relates to the
use of the self-aggregating core or envelope proteins as a
drug delivery system for anti-virals.

The invention also relates to the use of
single-stranded antisense poly- or oligonucleotides derived from
E1 or core cDNAs, or from both E1 and core cDNAs, to
inhibit expression of hepatitis C E1 and/or core genes.

The invention further relates to multiple
computer-generated alignments of the nucleotide and deduced
amino acid sequences of the E1 and core cDNAs. These
multiple sequence alignments produce consensus sequences
which serve to highlight regions of homology and
non-homology between sequences found within the same genotype
or in different genotypes and hence, these alignments can
be used by one skilled in the art to design peptides and
oligonucleotides useful as reagents in diagnostic assays
and vaccines.

The invention therefore also relates to purified
and isolated peptides and analogs thereof derived from E1
and core cDNA sequences.

The invention further relates to the use of these
peptides as diagnostic agents and vaccines.

The present invention also encompasses methods of
detecting antibodies specific for hepatitis C virus in
biological samples. The methods of detecting HCV or
antibodies to HCV disclosed in the present invention are
useful for diagnosis of infection and disease caused by HCV
and for monitoring the progression of such disease. Such
methods are also useful for monitoring the efficacy of
therapeutic agents during the course of treatment of HCV
infection and disease in a mammal.

The invention also provides a kit for the
detection of antibodies specific for HCV in a biological
sample where said kit contains at least one purified and
isolated peptide derived from the E1 or core cDNA
sequences. In addition, the invention provides for a kit
containing at least one purified and isolated peptide
derived from the E1 cDNA sequences and at least one
purified and isolated peptide derived from the core cDNA
sequences.

The invention further provides isolated and
purified genotype-specific oligonucleotides and analogs
thereof derived from E1 and core cDNA sequences.

The invention also relates to methods for
detecting the presence of hepatitis C virus in a mammal,
said methods comprising analyzing the RNA of a mammal for
the presence of hepatitis C virus. The invention further
relates to methods for determining the genotype of
hepatitis C virus present in a mammal. This method is
useful in determining the proper course of treatment for
an HCV-infected patient.

The invention also provides a diagnostic kit for
the detection of hepatitis C virus in a biological sample.
The kit comprises purified and isolated nucleic acid
sequences useful as primers for reverse-transcription
polymerase chain reaction (RT-PCR) analysis of RNA for the
presence of hepatitis C virus genomic RNA.

The invention further provides a diagnostic kit
for the determination of the genotype of a hepatitis C
virus present in a mammal. The kit comprises purified and
isolated nucleic acid sequences useful as primers for
RT-PCR analysis of RNA for the presence of HCV in a
biological sample and purified and isolated nucleic acid
sequences useful as hybridization probes in determining
the genotype of the HCV isolate detected in PCR analysis.

This invention also relates to pharmaceutical 
compositions useful in prevention or treatment of 
hepatitis C in a mammal. 

Description of Figures

Figures 1A-1 thru 1H-5 show computer generated
sequence alignments of the nucleotide sequences of 51 HCV
E1 cDNAs. The single letter abbreviations used for the
nucleotides shown in Figures 1A-1 thru 1H-5 are those
standardly used in the art. Figures 1A-1 thru 1A-10 show
the alignment of SEQ ID NOs:1-8 to produce a consensus
sequence for genotype I/1a. Figures 1B-1 thru 1B-10 show
the alignment of SEQ ID NOs:9-25 to produce a consensus
sequence for genotype II/1b. Figures 1C-1 thru 1C-5 show
the alignment of SEQ ID NOs:26-29 to produce a consensus
sequence for genotype III/2a. Figures 1D-1 thru 1D-5 show
the alignment of SEQ ID NOs:30-33 to produce a consensus
sequence for genotype IV/2b. Figures 1E-1 thru 1E-5 show
the alignment of SEQ ID NOs:35-39 to produce a consensus
sequence for genotype V/3a. Figures 1F-1 thru 1F-3 show
the computer alignment of SEQ ID NOs:42-43 to produce a
"consensus" sequence for genotype 4c where the "consensus"
sequence given is that of SEQ ID NO:42. Figures 1G-1 thru
1G-5 show the alignment of SEQ ID NOs:45-50 to produce a
consensus sequence for genotype 5a. The nucleotides shown
in capital letters in the consensus sequences of Figures
1A-1 thru 1G-5 are those conserved within a genotype
while nucleotides shown in lower case letters in the
consensus sequences are those variable within a genotype.
In addition, in Figures 1A-1 thru 1E-5 and 1G-1 thru 1G-5,
when the lower case letter is shown in a consensus
sequence, the lower case letter represents the nucleotide
found most frequently in the sequences aligned to produce
the consensus sequence. In Figures 1F-1 thru 1F-3, the
lower case letters shown in the consensus sequence are
nucleotides in SEQ ID NO:42 which differ from nucleotides
found in the same positions in SEQ ID NO:43. Finally, a
hyphen at a nucleotide position in the consensus sequences
in Figures 1A-1 thru 1G-5 indicates that two nucleotides
were found in equal numbers at that position in the
aligned sequences. In the aligned sequences, nucleotides
are shown in lower case letters if they differed from the
nucleotides of both adjacent isolates. Figures 1H-1 thru
1H-5 show the alignment of the consensus sequences of
Figures 1A-1 thru 1G-5 with SEQ ID NO:34 (genotype 2c),
SEQ ID NO:40 (genotype 4a), SEQ ID NO:41 (genotype 4b),
SEQ ID NO:44 (genotype 4d) and SEQ ID NO:51 (genotype 6a)
to produce a consensus sequence for all twelve genotypes.
This consensus sequence is shown as the bottom line of
Figures 1H-1 thru 1H-5 where the nucleotides shown in
capital letters are conserved among all genotypes and a
blank space indicates that the nucleotide at that position
is not conserved among all genotypes.
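The within-genotype consensus notation described for Figures 1A-1 thru 1E-5 and 1G-1 thru 1G-5 can be sketched as follows. This is an illustrative reading of the stated rules only (capital letter when conserved, lower case letter for the most frequent variable nucleotide, hyphen when the top two are found in equal numbers); the function name and the example sequences are hypothetical, and the special two-sequence rule of Figures 1F-1 thru 1F-3 is not covered.

```python
from collections import Counter


def genotype_consensus(aligned: list[str]) -> str:
    """Build a consensus line from equal-length aligned sequences.

    Upper case: position conserved in every sequence.
    Lower case: position variable; the most frequent residue is shown.
    Hyphen:     position variable and the two most frequent residues tie.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*(seq.upper() for seq in aligned)):
        ranked = Counter(column).most_common()
        if len(ranked) == 1:
            consensus.append(ranked[0][0])          # conserved
        elif ranked[0][1] == ranked[1][1]:
            consensus.append("-")                   # tie between top residues
        else:
            consensus.append(ranked[0][0].lower())  # most frequent variant
    return "".join(consensus)


# Hypothetical four-isolate alignment, not taken from the sequence listing.
print(genotype_consensus(["ACGT", "ACGA", "ACTA", "AGTC"]))
```

The same function applies unchanged to aligned amino acid sequences, since it treats each column as a set of single-letter symbols.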

Figures 2A-1 thru 2H-2 show computer alignments
of the deduced amino acid sequences of 51 HCV E1 cDNAs.
The single letter abbreviations used for the amino acids
shown in Figures 2A-1 thru 2H-2 follow the conventional
amino acid shorthand for the twenty naturally occurring
amino acids. Figures 2A-1 thru 2A-4 show the alignment of
SEQ ID NOs:52-59 to produce a consensus sequence for
genotype I/1a. Figures 2B-1 thru 2B-4 show the alignment
of SEQ ID NOs:60-76 to produce a consensus sequence for
genotype II/1b. Figures 2C-1 and 2C-2 show the alignment
of SEQ ID NOs:77-80 to produce a consensus sequence for
genotype III/2a. Figures 2D-1 and 2D-2 show the alignment
of SEQ ID NOs:81-84 to produce a consensus sequence for
genotype IV/2b. Figures 2E-1 and 2E-2 show the alignment
of SEQ ID NOs:86-90 to produce a consensus sequence for
genotype V/3a. Figure 2F-1 shows the computer alignment
of SEQ ID NOs:93-94 to produce a consensus sequence for
genotype 4c. Figures 2G-1 and 2G-2 show the alignment of
SEQ ID NOs:96-101 to produce a consensus sequence for
genotype 5a. The amino acids shown in capital letters in
the consensus sequences of Figures 2A-1 thru 2G-2 are
those conserved within a genotype while amino acids shown
in lower case letters in the consensus sequences are those
variable within a genotype. In addition, in Figures 2A-1
thru 2E-2 and 2G-1 thru 2G-2 when the lower case letter is
shown in a consensus sequence, the letter represents the
amino acid found most frequently in the sequences aligned
to produce the consensus sequence. In Figure 2F-1, the
lower case letters shown in the consensus sequence are
amino acids in SEQ ID NO:93 which differ from amino acids
found in the same positions in SEQ ID NO:94. Finally, a
hyphen at an amino acid position in the consensus
sequences of Figures 2A-1 thru 2G-2 indicates that two
amino acids were found in equal numbers at that position
in the aligned sequences. In the aligned sequences, amino
acids are shown in lower case letters if they differed
from the amino acids of both adjacent isolates. Figures
2H-1 and 2H-2 show the alignment of the consensus
sequences of Figures 2A-1 thru 2G-2 with SEQ ID NO:85
(genotype 2c), SEQ ID NO:91 (genotype 4a), SEQ ID NO:92
(genotype 4b), SEQ ID NO:95 (genotype 4d) and SEQ ID
NO:102 (genotype 6a) to produce a consensus sequence for
all twelve genotypes. This consensus sequence is shown as
the bottom line of Figures 2H-1 and 2H-2 where the amino
acids shown in capital letters are conserved among all
genotypes and a blank space indicates that the amino acid
at that position is not conserved among all genotypes.
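The convention for the aligned sequences themselves, in which a residue is shown in lower case if it differed from the residues of both adjacent isolates, can be sketched as below. The function name and the example alignment are hypothetical. The text does not say how the first and last isolates, which have only one neighbour, are treated; the sketch assumes they are compared with that single neighbour.

```python
def mark_against_neighbours(aligned: list[str]) -> list[str]:
    """Lower-case each residue that differs from all adjacent rows.

    Interior rows are compared with the rows directly above and below.
    Assumption: the first and last rows are compared with their single
    neighbour, since the source does not specify this case.
    """
    rows = [seq.upper() for seq in aligned]
    if rows and any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("all aligned sequences must have the same length")

    marked = []
    for i, row in enumerate(rows):
        neighbours = [rows[j] for j in (i - 1, i + 1) if 0 <= j < len(rows)]
        chars = []
        for pos, residue in enumerate(row):
            differs = bool(neighbours) and all(n[pos] != residue for n in neighbours)
            chars.append(residue.lower() if differs else residue)
        marked.append("".join(chars))
    return marked


# Hypothetical three-isolate amino acid alignment.
for line in mark_against_neighbours(["MKVLT", "MRVLT", "MKVLS"]):
    print(line)
```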

Figures 3A and 3B show multiple sequence
alignment of the deduced amino acid sequence of the E1
gene of 51 HCV isolates collected worldwide. The
consensus sequence of the E1 protein is shown in boldface
(top). In the consensus sequence cysteine residues are
highlighted with stars, potential N-linked glycosylation
sites are underlined, and invariant amino acids are
capitalized, whereas variable amino acids are shown in
lower case letters. In the alignment, amino acids are
shown in lower case letters if they differed from the
amino acid of both adjacent isolates. Amino acid residues
shown in bold print in the alignment represent residues
which at that position in the amino acid sequence are
genotype-specific. Amino acids that were invariant among
all HCV isolates are shown as hyphens (-) in the
alignment. Amino acid positions correspond to those of
the HCV prototype sequence (HCV-1, Choo, Q.-L. et al. (1991)
Proc. Natl. Acad. Sci. USA 88:2451-2455) with the first
amino acid of the E1 protein at position 192. The
grouping of isolates into 12 genotypes (I/1a, II/1b,
III/2a, IV/2b, V/3a, 2c, 4a, 4b, 4c, 4d, 5a and 6a) is
indicated.

Figure 4 shows a dendrogram of the genetic
relatedness of the twelve genotypes of HCV based on the
percent amino acid identity of the E1 gene of the HCV
genome. The twelve genotypes shown are designated as
I/1a, II/1b, III/2a, IV/2b, V/3a, 2c, 4a, 4b, 4c, 4d, 5a
and 6a. The shaded bars represent a range showing the
maximum and minimum homology between the amino acid
sequence of any one isolate of the genotype indicated and
the amino acid sequence of any other isolate.
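The percent amino acid identity underlying a dendrogram of this kind, and the minimum and maximum range shown by the shaded bars, can be sketched as follows. The function names and the example sequences are hypothetical, and identity is computed here as matching positions divided by aligned length; the source does not state the exact formula or the clustering method used to draw the dendrogram.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two sequences carry the same residue."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def identity_range(genotype_isolates: list[str], other_isolates: list[str]) -> tuple[float, float]:
    """Minimum and maximum identity between isolates of one genotype and other isolates."""
    values = [
        percent_identity(a, b)
        for a in genotype_isolates
        for b in other_isolates
    ]
    if not values:
        raise ValueError("both groups must contain at least one sequence")
    return min(values), max(values)


# Hypothetical short aligned peptide fragments.
group = ["MKVLT", "MKVLS"]
others = ["MRVIT", "MKVIT"]
print(identity_range(group, others))
```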

Figure 5 shows the distribution of the complete
E1 gene sequence of 74 HCV isolates into the twelve HCV
genotypes in the 12 countries studied. For 51 of these
HCV isolates, including 8 isolates of genotype I/1a, 17
isolates of genotype II/1b and 26 isolates comprising the
additional 10 genotypes, the complete E1 gene sequence was
determined. In the remaining 23 isolates, all of
genotypes I/1a and II/1b, the genotype assignment was
based on only a partial E1 gene sequence. The partially
sequenced isolates did not represent additional genotypes
in any of the 12 countries. The number of isolates of a
particular genotype is given in each of the 12 countries
studied. For ease of viewing, those genotypes designated
by two terms (e.g., I/1a) are indicated by the latter term
(e.g., 1a). The designations used for each country are:
Denmark (DK); Dominican Republic (DR); Germany (D); Hong
Kong (HK); India (IND); Sardinia, Italy (S); Peru (P);
South Africa (SA); Sweden (SW); Taiwan (T); United States
(US); and Zaire (Z). National borders depicted in this
figure represent those existing at the time of sampling.

Figures 6A-1 thru 6K-2 show computer generated
sequence alignments of the nucleotide sequences of 52 HCV
core cDNAs. Single letter abbreviations used for the
nucleotides shown in Figures 6A-1 thru 6J-4 are those
standardly used in the art. Figures 6A-1 thru 6A-4 show
the alignment of SEQ ID NOs: 103-108 to produce a
consensus sequence for genotype I/1a. Figures 6B-1 thru
6B-10 show the alignment of SEQ ID NOs: 109-124 to produce
a consensus sequence for genotype II/1b. Figures 6C-1
thru 6C-10 show the alignments of the sequences comprising
minor genotypes I/1a (SEQ ID NOs: 103-108) and II/1b (SEQ
ID NOs: 109-124) to produce a consensus sequence for the
major genotype, genotype 1. Figures 6D-1 thru 6D-3 show
the alignment of SEQ ID NOs: 125-128 to produce a
consensus sequence for genotype III/2a. Figures 6E-1 thru
6E-4 show the alignment of SEQ ID NOs: 129-133 to produce
a consensus sequence for genotype IV/2b. Figures 6F-1
thru 6F-5 show the alignment of the sequences of minor
genotypes III/2a (SEQ ID NOs: 125-128), IV/2b (SEQ ID NOs:
129-133) and 2c (SEQ ID NO: 134) to produce a consensus
sequence for the major genotype, genotype 2. Figures 6G-1
thru 6G-3 show the alignment of SEQ ID NOs: 135-138 to
produce a consensus sequence for genotype V/3a. Figures
6H-1 thru 6H-4 show the computer alignment of the
sequences of minor genotypes 4a-4f (SEQ ID NOs: 139-145)
to produce a consensus sequence for the major genotype,
genotype 4. Figures 6I-1 thru 6I-4 show the alignment of
SEQ ID NOs: 146-153 to produce a consensus sequence for
genotype 5a. The nucleotides shown in capital letters in
the consensus sequences in Figures 6A-1 thru 6I-4 are
those conserved within the genotype while nucleotides
shown in lower case letters in the consensus sequences are
those variable within a genotype. In addition, when the
lower case letter is shown in the consensus sequence, the
lower case letter represents the nucleotide found most
frequently in the sequences aligned to produce that
consensus sequence. Moreover, a hyphen at a nucleotide
position in the consensus sequences in Figures 6A-1 thru
6I-4 indicates that two nucleotides were found in equal
numbers at that position in the sequences aligned to
produce the consensus sequence. Finally, nucleotides are
shown in lower case letters in the sequences aligned to
produce each consensus sequence shown in Figures 6A-1 thru
6I-4, if they differed from the nucleotides of both
adjacent isolates. Figures 6J-1 thru 6J-4 show the
alignment of the consensus sequences of major genotypes 1
(Figures 6C-1 thru 6C-10), 2 (Figures 6F-1 thru 6F-5), 3
(Figures 6G-1 thru 6G-3), 4 (Figures 6H-1 thru 6H-4), 5
(Figures 6I-1 thru 6I-4) and 6 (SEQ ID NO: 154) to produce
a consensus sequence for all genotypes and Figures 6K-1
and 6K-2 show the alignment of consensus sequences of
Figures 6A-1 thru 6A-4, 6B-1 thru 6B-10, 6D-1 thru 6D-3,
6E-1 thru 6E-4, 6G-1 thru 6G-3 and 6I-1 thru 6I-4 with SEQ
ID NO:134 (genotype 2c), SEQ ID NO:139 (genotype 4a), SEQ
ID NO:141 (genotype 4b), SEQ ID NO:143 (genotype 4c), SEQ
ID NO:145 (genotype 4d), SEQ ID NO:142 (genotype 4e), SEQ
ID NO:140 (genotype 4f) and SEQ ID NO:154 (genotype 6a) to
produce a consensus sequence for all fourteen genotypes.
The nucleotides shown in capital letters in the consensus
sequences of Figures 6J-1 thru 6K-2 are conserved among
all genotypes and the nucleotides shown in lower case
letters represent the nucleotides found most frequently in
the sequences aligned to produce this consensus sequence.
In addition, the presence of a hyphen at a nucleotide
position in all fourteen sequences aligned in Figures 6K-1
and 6K-2 indicates that the nucleotide found at that
position in the aligned sequences is the same as the
nucleotide shown at the corresponding position in the
consensus sequences of Figure 6K.
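
By way of illustration only, the lettering conventions
described above for the consensus sequences can be sketched
in Python. The function name `consensus` and the example
sequences are hypothetical and are not taken from the
sequence listing. Treating any tie for the most frequent
residue as a hyphen is an assumption, since the text
describes only the case of two residues found in equal
numbers. The sketch does not reproduce the lower case
marking of the individual aligned sequences.

```python
from collections import Counter


def consensus(aligned):
    """Build a consensus string from equal-length aligned sequences.

    Conventions (as described in the text, with one assumption):
      - capital letter: residue conserved in every sequence
      - lower case letter: most frequent residue at a variable position
      - hyphen: tie for most frequent residue (assumed to cover any tie)
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")

    out = []
    for column in zip(*(s.upper() for s in aligned)):
        ranked = Counter(column).most_common()
        top, top_count = ranked[0]
        if len(ranked) == 1:
            out.append(top)           # conserved position
        elif ranked[1][1] == top_count:
            out.append("-")           # two residues equally frequent
        else:
            out.append(top.lower())   # variable, most frequent residue
    return "".join(out)


# Hypothetical four-sequence alignment, for illustration only.
example = ["ACGT", "ACGA", "ACTA", "AGTA"]
print(consensus(example))
```

For the four hypothetical sequences shown, the first
position is conserved, the second and fourth are variable
with a single most frequent nucleotide, and the third is a
tie, so the function returns `Ac-a`.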

Figures 7A-1 thru 7J-1 show computer alignments
of the deduced amino acid sequences of the 52 HCV core
cDNAs. The single letter abbreviations used for the amino
acids shown in Figures 7A-1 thru 7J-1 follow the
conventional amino acid shorthand for the twenty naturally
occurring amino acids. Figures 7A-1 and 7A-2 show the
alignment of SEQ ID NOs: 155-160 to produce a consensus
sequence for genotype I/1a. Figures 7B-1 and 7B-2 show
the alignment of SEQ ID NOs: 161-176 to produce a
consensus sequence for genotype II/1b. Figures 7C-1 thru
7C-4 show the alignment of the sequences comprising minor
genotypes I/1a (SEQ ID NOs: 155-160) and II/1b (SEQ ID NOs:
161-176) to produce a consensus sequence for the major
genotype, genotype 1. Figure 7D-1 shows the alignment of
SEQ ID NOs: 177-180 to produce a consensus sequence for
genotype III/2a. Figure 7E-1 shows the alignment of SEQ
ID NOs: 181-185 to produce a consensus sequence for
genotype IV/2b. Figures 7F-1 and 7F-2 show the alignment
of the sequences of minor genotypes III/2a (SEQ ID NOs:
177-180), IV/2b (SEQ ID NOs: 181-185) and 2c (SEQ ID NO:
186) to produce a consensus sequence for the major
genotype, genotype 2. Figure 7G-1 shows the alignment of
SEQ ID NOs: 187-190 to produce a consensus sequence for
genotype V/3a. Figures 7H-1 and 7H-2 show the computer
alignment of the sequences of minor genotypes 4a-4f (SEQ
ID NOs: 191-197) to produce a consensus sequence for the
major genotype, genotype 4. Figures 7I-1 and 7I-2 show
the alignment of SEQ ID NOs: 198-205 to produce a
consensus sequence for genotype 5a. The amino acids shown
in capital letters in the consensus sequences of Figures
7A-1 thru 7I-2 are those conserved within the genotype
while amino acids shown in lower case letters in the
consensus sequences are those variable within the
genotype. In addition, when a lower case letter is found
in the consensus sequences shown in Figures 7A-1 thru
7I-2, the letter represents the amino acid found most
frequently in the sequences aligned to produce that
consensus sequence. Moreover, a hyphen in an amino acid
position in the consensus sequences of Figures 7A-1 thru
7I-2 indicates that two amino acids were found in equal
numbers at that position in the sequences aligned to
produce that consensus sequence. Finally, amino acids are
shown in lower case letters in the sequences aligned to
produce the consensus sequences shown in Figures 7A-1 thru
7I-2 if these amino acids differed from the amino acids of
both adjacent isolates. Figure 7J-1 shows the alignment
of the consensus sequences of major genotypes 1 (Figures
7C-1 thru 7C-4), 2 (Figure 7F-1), 3 (Figure 7G-1), 4
(Figures 7H-1 and 7H-2), 5 (Figures 7I-1 and 7I-2) and 6
(SEQ ID NO: 154) to produce a consensus sequence for all
genotypes and Figure 7K-1 shows the alignment of the
consensus sequences of Figures 7A-1 and 7A-2, 7B-1 and
7B-2, 7D-1, 7E-1, 7G-1 and 7I-1 and 7I-2 with SEQ ID NO:186
(genotype 2c), SEQ ID NO:191 (genotype 4a), SEQ ID NO:193
(genotype 4b), SEQ ID NO:195 (genotype 4c), SEQ ID NO:197
(genotype 4d), SEQ ID NO:194 (genotype 4e), SEQ ID NO:192
(genotype 4f) and SEQ ID NO:206 (genotype 6a) to produce a
consensus sequence for all fourteen genotypes. The amino
acids shown in capital letters in the consensus sequences
shown in Figures 7J-1 and 7K-1 are conserved among all
genotypes while the amino acids shown in lower case
letters represent amino acids found most frequently in the
sequences aligned to produce this consensus sequence. In
addition, the presence of a hyphen at an amino acid
position in all fourteen sequences aligned in Figure 7K-1
indicates that the amino acid found at that position in
the aligned sequences is the same as the amino acid shown
at the corresponding position in the consensus sequence of
Figure 7K-1.

Figures 8A and 8B show phylogenetic trees
illustrating the calculated evolutionary relationships of
the different HCV isolates based upon the C gene sequence
of 52 HCV isolates (Figure 8A) and the E1 gene sequence of
51 HCV isolates (Figure 8B), respectively. The
phylogenetic trees were constructed by the unweighted
pair-group method with arithmetic mean (Nei, M. (1987)
Molecular Evolutionary Genetics (Columbia University
Press, New York, N.Y.), pp 287-326) using the computer
software package "Gene Works" from IntelliGenetics. The
lengths of the horizontal lines connecting the sequences,
given in absolute values from 0 to 1, are proportional to
the estimated genetic distances between the sequences.
Genotype designations of HCV isolates are indicated. In 45
HCV isolates, both the C and the E1 gene sequences were
determined.
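
The unweighted pair-group method with arithmetic mean
(UPGMA) named above can be sketched as follows. This is a
minimal Python illustration of the general clustering
method, not a reproduction of the "Gene Works" software,
whose distance measure and implementation details are not
stated here. The function name `upgma`, the isolate labels
and the distance values are hypothetical.

```python
def upgma(names, dist):
    """Cluster labels by UPGMA.

    names: list of labels.
    dist:  symmetric matrix; dist[i][j] is the distance between
           names[i] and names[j].
    Returns (tree, height): tree is a nested tuple of labels and
    height is the height of the root (half the final merge distance).
    """
    n = len(names)
    # cluster id -> (subtree, number of leaves)
    clusters = {i: (names[i], 1) for i in range(n)}
    # pairwise distances keyed by (smaller id, larger id)
    d = {(i, j): float(dist[i][j]) for i in range(n) for j in range(i + 1, n)}
    next_id = n
    height = 0.0

    while len(clusters) > 1:
        # Join the two closest clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)

        # Distance from the merged cluster to each remaining cluster is
        # the size-weighted mean of the two old distances, which equals
        # the arithmetic mean over all pairs of leaves.
        new_d = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)

        # Drop distances involving the merged clusters, add the new ones.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)

        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        height = dab / 2.0
        next_id += 1

    tree, _ = next(iter(clusters.values()))
    return tree, height


# Hypothetical distances between three isolates, for illustration only.
labels = ["iso1", "iso2", "iso3"]
matrix = [
    [0.0, 0.2, 0.6],
    [0.2, 0.0, 0.6],
    [0.6, 0.6, 0.0],
]
print(upgma(labels, matrix))
```

In this hypothetical example the two closest isolates are
joined first, and the third isolate is joined to that pair
at a root height of one half of the averaged distance.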

Detailed Description Of Invention

The present invention relates to cDNAs encoding
the complete nucleotide sequence of the envelope 1 (E1)
and core genes of isolates of human hepatitis C virus
(HCV). The E1 cDNAs of the present invention were
obtained as follows. Viral RNA was extracted from serum
collected from humans infected with hepatitis C virus and
the viral RNA was then reverse transcribed and amplified
by polymerase chain reaction using primers deduced from
the sequence of the HCV strain H-77 (Ogata, N. et al.
(1991) Proc. Natl. Acad. Sci. U.S.A. 88:3392-3396). The
amplified cDNA was then isolated by gel electrophoresis
and sequenced.

The present invention further relates to the
nucleotide sequences of the cDNAs encoding the E1 gene of
51 HCV isolates. These nucleotide sequences are shown in
the sequence listing as SEQ ID NO:1 through SEQ ID NO:51.
The abbreviations used for the nucleotides are
those standardly used in the art.

The deduced amino acid sequence of each of SEQ
ID NO:1 through SEQ ID NO:51 is presented in the sequence
listing as SEQ ID NO:52 through SEQ ID NO:102, where the
amino acid sequence in SEQ ID NO:52 is deduced from the
nucleotide sequence shown in SEQ ID NO:1, the amino acid
sequence shown in SEQ ID NO:53 is deduced from the
nucleotide sequence shown in SEQ ID NO:2 and so on. The
deduced amino acid sequence of each of SEQ ID NOs: 52-102
starts at nucleotide 1 of the corresponding nucleic acid
sequence shown in SEQ ID NOs: 1-51 and extends 575
nucleotides to a total length of 576 nucleotides.

The three letter abbreviations used in SEQ ID
NOs: 52-102 follow the conventional amino acid shorthand
for the twenty naturally occurring amino acids.

The present invention also relates to the
nucleotide sequences of the cDNAs encoding the core gene
of 52 HCV isolates. These nucleotide sequences are shown
in the sequence listing as SEQ ID NO:103 through SEQ ID
NO:154.

The core cDNAs of the present invention were
obtained as follows. Viral RNA was extracted from serum
and reverse transcribed as described above for cloning of
the E1 cDNAs. The core cDNAs of the present invention
were then amplified by polymerase chain reaction using
primers deduced from previously determined sequences that
flank the core gene (Bukh et al. (1992) Proc. Natl. Acad.
Sci. U.S.A., 89:4942-4946; Bukh et al. (1993) Proc. Natl.
Acad. Sci. U.S.A., 90:8234-8238).

The deduced amino acid sequence of each of SEQ
ID NO:103 through SEQ ID NO:154 is presented in the
sequence listing as SEQ ID NO:155 through SEQ ID NO:206,
where the amino acid sequence in SEQ ID NO:155 is deduced
from the nucleotide sequence shown in SEQ ID NO:103, the
amino acid sequence shown in SEQ ID NO:156 is deduced from
the nucleotide sequence shown in SEQ ID NO:104 and so on.
The deduced amino acid sequence of each of SEQ ID NOs:
155-206 starts at nucleotide 1 of the corresponding
nucleotide sequence shown in SEQ ID NOs: 103-154 and
extends 572 nucleotides to a total length of 573
nucleotides.
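
The numbering described above for the E1 and core
sequences amounts to a fixed offset between each nucleotide
SEQ ID NO and its deduced amino acid SEQ ID NO, and the
stated lengths of 576 and 573 nucleotides each divide into
whole codons. A short Python sketch of that arithmetic
follows. The function names are hypothetical and the sketch
does not read the sequence listing itself.

```python
# SEQ ID NO ranges as stated in the text.
E1_NUCLEOTIDE_IDS = range(1, 52)       # SEQ ID NOs: 1-51
CORE_NUCLEOTIDE_IDS = range(103, 155)  # SEQ ID NOs: 103-154


def protein_seq_id(nucleotide_seq_id):
    """Return the SEQ ID NO of the deduced amino acid sequence."""
    if nucleotide_seq_id in E1_NUCLEOTIDE_IDS:
        return nucleotide_seq_id + 51    # 1 -> 52, ..., 51 -> 102
    if nucleotide_seq_id in CORE_NUCLEOTIDE_IDS:
        return nucleotide_seq_id + 52    # 103 -> 155, ..., 154 -> 206
    raise ValueError("not an E1 or core nucleotide SEQ ID NO")


def codon_count(nucleotide_length):
    """Number of whole codons in a reading frame starting at nucleotide 1."""
    if nucleotide_length % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return nucleotide_length // 3


print(protein_seq_id(1), protein_seq_id(103))   # 52 155
print(codon_count(576), codon_count(573))       # 192 191
```

Under this arithmetic the 576 nucleotide E1 sequences
correspond to 192 codons and the 573 nucleotide core
sequences to 191 codons.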

Preferably, the E1 and core proteins and
peptides of the present invention are substantially
homologous to, and most preferably biologically equivalent
to, native HCV E1 and core proteins and peptides. By
"biologically equivalent" as used throughout the
specification and claims, it is meant that the
compositions are immunogenically equivalent to the native
E1 and core proteins and peptides. The E1 and core
proteins and peptides of the present invention may also
stimulate the production of protective antibodies upon
injection into a mammal that would serve to protect the
mammal upon challenge with HCV. By "substantially
homologous" as used throughout the ensuing specification
and claims to describe E1 and core proteins and peptides,
it is meant a degree of homology in the amino acid
sequence of the E1 and core proteins and peptides to the
native E1 and core proteins and peptides respectively.
Preferably the degree of homology is in excess of 90%,
preferably in excess of 95%, with a particularly preferred
group of proteins being in excess of 99% homologous with
the native E1 or core proteins and peptides.
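
As a rough illustration of the thresholds above, the
following Python sketch computes percent identity between
two amino acid sequences of equal length and reports which
of the stated levels is exceeded. Measuring homology as
position-by-position identity over pre-aligned sequences is
an assumption, since the text does not specify how the
degree of homology is calculated. The function names and
the example peptides are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues (equal-length input)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def homology_level(percent):
    """Return the highest stated threshold strictly exceeded, or None."""
    for threshold in (99, 95, 90):
        if percent > threshold:
            return threshold
    return None


# Hypothetical 20-residue peptides differing at one position.
native = "MSTNPKPQRKTKRNTNRRPQ"
variant = "MSTNPKPQKKTKRNTNRRPQ"
pct = percent_identity(native, variant)
print(pct, homology_level(pct))
```

With one difference in twenty residues the identity is 95
percent, which exceeds the 90 level but is not in excess of
95.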

Variations are contemplated in the cDNA
sequences shown in SEQ ID NO:1 through SEQ ID NO:51 and in
SEQ ID NO:103 through SEQ ID NO:154 which will result in a
nucleic acid sequence that is capable of directing
production of analogs of the corresponding protein shown
in SEQ ID NO:52 through SEQ ID NO:102 and in SEQ ID NO:155
through SEQ ID NO:206. It should be noted that the cDNA
sequences set forth above represent a preferred embodiment
of the present invention. Due to the degeneracy of the
genetic code, it is to be understood that numerous choices
of nucleotides may be made that will lead to a DNA
sequence capable of directing production of the instant
protein or its analogs. As such, DNA sequences which are
functionally equivalent to the sequence set forth above or
which are functionally equivalent to sequences that would
direct production of analogs of the E1 and core proteins
produced pursuant to the amino acid sequences set forth
above, are intended to be encompassed within the present
invention.
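
To give a sense of scale for the degeneracy of the genetic
code mentioned above, the following Python sketch counts
how many distinct DNA coding sequences specify a given
peptide under the standard genetic code. The function name
and the example peptide are hypothetical, and the count
ignores any practical constraint such as codon usage in a
particular host.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons that encode each amino acid ("*" marks stop codons).
DEGENERACY = Counter(CODON_TABLE.values())


def coding_sequence_count(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(DEGENERACY[aa] for aa in peptide.upper())


# Hypothetical six-residue peptide, one-letter amino acid code.
print(coding_sequence_count("MSTNPK"))
```

For the hypothetical peptide shown, the per-residue codon
counts are 1, 6, 4, 2, 4 and 2, giving 384 alternative
coding sequences for only six residues.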

The term analog as used throughout the
specification or claims to describe the E1 and core
proteins and peptides of the present invention, includes
any protein or peptide having an amino acid residue
sequence substantially identical to a sequence
specifically shown herein in which one or more residues
have been conservatively substituted with a
biologically-equivalent residue. Examples of conservative
substitutions include the substitution of one non-polar
(hydrophobic) residue such as isoleucine, valine, leucine
or methionine for another, the substitution of one polar
(hydrophilic) residue for another such as between arginine
and lysine, between glutamine and asparagine, between
glycine and serine, the substitution of one basic residue
such as lysine, arginine or histidine for another, or the
substitution of one acidic residue, such as aspartic acid
or glutamic acid for another.
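
The example substitutions listed above can be expressed as
residue groups and checked programmatically. The following
Python sketch is illustrative only. It encodes just the
examples given in the text, using one-letter amino acid
codes, and does not cover chemically derivatized residues
or any other substitution. The names `CONSERVATIVE_GROUPS`
and `is_conservative` are hypothetical.

```python
# Groups taken from the examples in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    frozenset("IVLM"),  # hydrophobic: isoleucine, valine, leucine, methionine
    frozenset("RK"),    # arginine and lysine
    frozenset("QN"),    # glutamine and asparagine
    frozenset("GS"),    # glycine and serine
    frozenset("KRH"),   # basic: lysine, arginine, histidine
    frozenset("DE"),    # acidic: aspartic acid, glutamic acid
]


def is_conservative(original, replacement):
    """True if the two residues differ and share one of the listed groups."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution took place
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("I", "V"))  # both hydrophobic
print(is_conservative("D", "K"))  # acidic to basic
```

Under these groups, isoleucine to valine is reported as
conservative, while aspartic acid to lysine is not.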

The phrase "conservative substitution" also
includes the use of a chemically derivatized residue in
place of a non-derivatized residue provided that the
resulting protein or peptide is biologically equivalent to
the native E1 or core protein or peptide.

"Chemical derivative" refers to an E1 or core
protein or peptide having one or more residues chemically
derivatized by reaction of a functional side group.
Examples of such derivatized molecules include, but are
not limited to, those molecules in which free amino groups
have been derivatized to form amine hydrochlorides,
p-toluene sulfonyl groups, carbobenzoxy groups,
t-butyloxycarbonyl groups, chloroacetyl groups or formyl
groups. Free carboxyl groups may be derivatized to form
salts, methyl and ethyl esters or other types of esters or
hydrazides. Free hydroxyl groups may be derivatized to
form O-acyl or O-alkyl derivatives. The imidazole
nitrogen of histidine may be derivatized to form
N-im-benzylhistidine. Also included as chemical derivatives
are those proteins or peptides which contain one or more
naturally-occurring amino acid derivatives of the twenty
standard amino acids. For example: 4-hydroxyproline may
be substituted for proline; 5-hydroxylysine may be
substituted for lysine; 3-methylhistidine may be
substituted for histidine; homoserine may be substituted
for serine; and ornithine may be substituted for lysine.
The E1 and core proteins and peptides of the present
invention also include any protein or peptide having one
or more additions and/or deletions of residues relative to
the sequence of a peptide whose sequence is shown herein,
so long as the peptide is biologically equivalent to the
native E1 or core protein or peptide.

The present invention also includes a
recombinant DNA method for the manufacture of HCV E1 and
core proteins. In this method, natural or synthetic
nucleic acid sequences may be used to direct the
production of E1 and core proteins.

In one embodiment of the invention, the method
comprises:

(a) preparation of a nucleic acid sequence
capable of directing a host organism to produce HCV E1 or
core protein;

(b) cloning the nucleic acid sequence into a
vector capable of being transferred into and replicated in
a host organism, such vector containing operational
elements for the nucleic acid sequence;

(c) transferring the vector containing the
nucleic acid and operational elements into a host organism
capable of expressing the protein;

(d) culturing the host organism under
conditions appropriate for amplification of the vector and
expression of the protein; and

(e) harvesting the protein.

In another embodiment of the invention, the
method for the recombinant DNA synthesis of an HCV E1
protein encoded by any one of the nucleic acid sequences
shown in SEQ ID NOs: 1-51 comprises:

(a) culturing a transformed or transfected host
organism containing a nucleic acid sequence capable of
directing the host organism to produce a protein, under
conditions such that the protein is produced, said protein
exhibiting substantial homology to a native E1 protein
isolated from HCV having the amino acid sequence according
to any one of the amino acid sequences shown in SEQ ID
NOs: 52-102 or combinations thereof.

In one embodiment, the RNA sequence of an HCV
isolate was isolated and converted to cDNA as follows.
Viral RNA is extracted from a biological sample collected
from human subjects infected with hepatitis C and the
viral RNA is then reverse transcribed and amplified by
polymerase chain reaction using primers deduced from the
sequence of HCV strain H-77 (Ogata et al. (1991)).
Preferred primer sequences are shown as SEQ ID NOs: 207-212
in the sequence listing. Once amplified, the PCR
fragments are isolated by gel electrophoresis and
sequenced.

In an alternative embodiment, the above method
may be utilized for the recombinant DNA synthesis of an
HCV core protein encoded by any one of the nucleic acid
sequences shown in SEQ ID NOs: 103-154, where the protein
produced by this method exhibits substantial homology to a
native core protein isolated from HCV having the amino acid
sequence according to any one of the amino acid sequences
shown in SEQ ID NOs: 155-206 or combinations thereof.

The vectors contemplated for use in the present
invention include any vectors into which a nucleic acid
sequence as described above can be inserted, along with
any preferred or required operational elements, and which
vector can then be subsequently transferred into a host
organism and replicated in such organisms. Preferred
vectors are those whose restriction sites have been well
documented and which contain the operational elements
preferred or required for transcription of the nucleic
acid sequence.

The "operational elements" as discussed herein
include at least one promoter, at least one operator, at
least one leader sequence, at least one terminator codon,
and any other DNA sequences necessary or preferred for
appropriate transcription and subsequent translation of
the vector nucleic acid. In particular, it is
contemplated that such vectors will contain at least one
origin of replication recognized by the host organism
along with at least one selectable marker and at least one
promoter sequence capable of initiating transcription of
the nucleic acid sequence.

In construction of the recombinant expression
vectors of the present invention, it should additionally
be noted that multiple copies of the nucleic acid sequence
of interest (either E1 or core) and its attendant
operational elements may be inserted into each vector. In
such an embodiment, the host organism would produce
greater amounts per vector of the desired E1 or core
protein. The number of multiple copies of the nucleic
acid sequence which may be inserted into the vector is
limited only by the ability of the resultant vector, due to
its size, to be transferred into and replicated and
transcribed in an appropriate host microorganism.

Of course, those skilled in the art would
readily understand that copies of both core and E1 nucleic
acid sequences may be inserted into a single vector such
that a host organism transformed or transfected with said
vector would produce both the desired E1 and core
proteins. For example, a polycistronic vector in which
multiple different E1 and/or core proteins may be
expressed from a single vector is created by placing
expression of each protein under control of an internal
ribosomal entry site (IRES) (Molla, A. et al. Nature,
356:255-257 (1992); Jang, S.K. et al. J. of Virol.,
63:1651-1660 (1989)).

In another embodiment, restriction digest
fragments containing a coding sequence for E1 or core
proteins can be inserted into a suitable expression vector
that functions in prokaryotic or eukaryotic cells. By
suitable is meant that the vector is capable of carrying
and expressing a complete nucleic acid sequence coding for
an E1 or core protein. Preferred expression vectors are
those that function in a eukaryotic cell. Examples of
such vectors include but are not limited to vaccinia virus
vectors, adenovirus or herpes viruses. A preferred vector
is the baculovirus transfer vector, pBlueBac.

In yet another embodiment, the selected
recombinant expression vector may then be transfected into
a suitable eukaryotic cell system for purposes of
expressing the recombinant protein. Such eukaryotic cell
systems include but are not limited to cell lines such as
HeLa, MRC-5 or CV-1. A preferred eukaryotic cell system
is SF9 insect cells.

The expressed recombinant protein may be 
detected by methods known in the art including, but not 
limited to, Coomassie blue staining and Western blotting. 

The present invention also relates to
substantially purified and isolated recombinant E1 and
core proteins. In one embodiment, the recombinant protein
expressed by the SF9 cells can be obtained as a crude
lysate or it can be purified by standard protein
purification procedures known in the art which may include
differential precipitation, molecular sieve
chromatography, ion-exchange chromatography, isoelectric
focusing, gel electrophoresis and affinity and
immunoaffinity chromatography. The recombinant protein
may be purified by passage through a column containing a
resin which has bound thereto antibodies specific for the
open reading frame (ORF) protein.

The present invention further relates to the use
of recombinant E1 and core proteins as diagnostic agents
and vaccines. In one embodiment, the expressed
recombinant proteins of this invention can be used in
immunoassays for diagnosing or prognosing hepatitis C in a
mammal. For the purposes of the present invention,
"mammal" as used throughout the specification and claims,
includes, but is not limited to humans, chimpanzees, other
primates and the like. In a preferred embodiment, the
immunoassay is useful in diagnosing hepatitis C infection
in humans.

Immunoassays of the present invention may be
those commonly used by those skilled in the art including,
but not limited to, radioimmunoassay, Western blot assay,
immunofluorescent assay, enzyme immunoassay,
chemiluminescent assay, immunohistochemical assay,
immunoprecipitation and the like. Standard techniques
known in the art for ELISA are described in Methods in
Immunodiagnosis, 2nd Edition, Rose and Bigazzi, eds., John
Wiley and Sons, 1980 and Campbell et al., Methods of
Immunology, W.A. Benjamin, Inc., 1964, both of which are
incorporated herein by reference. Such assays may be a
direct, indirect, competitive, or noncompetitive
immunoassay as described in the art (Oellerich, M. 1984.
J. Clin. Chem. Clin. Biochem. 22:895-904). Biological
samples appropriate for such detection assays include, but
are not limited to, serum, liver, saliva, lymphocytes or
other mononuclear cells.

In a preferred embodiment, test serum is reacted
with a solid phase reagent having surface-bound
recombinant HCV E1 and/or core protein(s) as antigen(s).
The solid surface reagent can be prepared by known
techniques for attaching protein to solid support
material. These attachment methods include non-specific
adsorption of the protein to the support or covalent
attachment of the protein to a reactive group on the
support. After reaction of the antigen with anti-HCV
antibody, unbound serum components are removed by washing
and the antigen-antibody complex is reacted with a
secondary antibody such as labelled anti-human antibody.
The label may be an enzyme which is detected by incubating
the solid support in the presence of a suitable
fluorimetric or colorimetric reagent. Other detectable
labels may also be used, such as radiolabels or colloidal
gold, and the like.

The HCV E1 and/or core proteins and analogs
thereof may be prepared in the form of a kit, alone, or in
combinations with other reagents such as secondary
antibodies, for use in immunoassays.

In yet another embodiment the recombinant E1 and
core proteins or analogs thereof can be used as a vaccine
to protect mammals against challenge with hepatitis C.
The vaccine, which acts as an immunogen, may be a cell,
cell lysate from cells transfected with a recombinant
expression vector or a culture supernatant containing the
expressed protein. Alternatively, the immunogen is a
partially or substantially purified recombinant protein.
In yet another embodiment, the immunogen may be a fusion
protein comprising core protein and a second, non-core
protein joined together such that the core portion of the
fusion protein will aggregate and "trap" the second
protein on the surface of the particle produced by
aggregation of the core protein ("Molecular Biology of
the Hepatitis B Virus", McLachlan, A. (1991) CRC Press,
Boca Raton, Fla.). Alternatively, the core protein could
be mixed with the second protein in vitro to produce
particles in which all or part of the second protein was
exposed on the surface of the particle. Such particles
would then serve as a carrier in a multi-valent vaccine
preparation. Second proteins or parts thereof which could
be mixed with or fused to the core protein include, but
are not limited to, HCV E1 and hepatitis B surface
antigen.

While it is possible for the immunogen to be 
administered in a pure or substantially pure form, it is 
preferable to present it as a pharmaceutical composition, 
formulation or preparation. 

The formulations of the present invention, both
for veterinary and for human use, comprise an immunogen as
described above, together with one or more
pharmaceutically acceptable carriers and optionally other
therapeutic ingredients. The carrier(s) must be
"acceptable" in the sense of being compatible with the
other ingredients of the formulation and not deleterious
to the recipient thereof. The formulations may
conveniently be presented in unit dosage form and may be
prepared by any method well-known in the pharmaceutical
art.

All methods include the step of bringing into
association the active ingredient with the carrier which
constitutes one or more accessory ingredients. In
general, the formulations are prepared by uniformly and
intimately bringing into association the active ingredient
with liquid carriers or finely divided solid carriers or
both, and then, if necessary, shaping the product into the
desired formulation.

Formulations suitable for intravenous,
intramuscular, subcutaneous, or intraperitoneal
administration conveniently comprise sterile aqueous
solutions of the active ingredient with solutions which
are preferably isotonic with the blood of the recipient.
Such formulations may be conveniently prepared by
dissolving the solid active ingredient in water
containing physiologically compatible substances such as
sodium chloride (e.g., 0.1-2.0 M), glycine, and the like,
and having a buffered pH compatible with physiological
conditions to produce an aqueous solution, and rendering
said solution sterile. These may be present in unit or
multi-dose containers, for example, sealed ampules or
vials.

The formulations of the present invention may
incorporate a stabilizer. Illustrative stabilizers are
preferably incorporated in an amount of 0.10-10,000 parts
by weight per part by weight of immunogens. If two or
more stabilizers are to be used, their total amount is
preferably within the range specified above. These
stabilizers are used in aqueous solutions at the
appropriate concentration and pH. The specific osmotic
pressure of such aqueous solutions is generally in the
range of 0.1-3.0 osmoles, preferably in the range of
0.8-1.2. The pH of the aqueous solution is adjusted to be
within the range of 5.0-9.0, preferably within the range
of 6-8. In formulating the immunogen of the present
invention, an anti-adsorption agent may be used.

Additional pharmaceutical methods may be
employed to control the duration of action. Controlled
release preparations may be achieved through the use of
polymer to complex or adsorb the proteins or their
derivatives. The controlled delivery may be exercised by
selecting appropriate macromolecules (for example
polyester, polyamino acids, polyvinylpyrrolidone,
ethylenevinylacetate, methylcellulose,
carboxymethylcellulose, or protamine sulfate) and the
concentration of macromolecules as well as the methods of
incorporation in order to control release. Another
possible method to control the duration of action by
controlled-release preparations is to incorporate the
proteins, protein analogs or their functional derivatives,
into particles of a polymeric material such as polyesters,
polyamino acids, hydrogels, poly(lactic acid) or ethylene
vinylacetate copolymers. Alternatively, instead of
incorporating these agents into polymeric particles, it is
possible to entrap these materials in microcapsules
prepared, for example, by coacervation techniques or by
interfacial polymerization, for example,
hydroxymethylcellulose or gelatin-microcapsules and
poly(methylmethacrylate) microcapsules, respectively, or in
colloidal drug delivery systems, for example, liposomes,
albumin microspheres, microemulsions, nanoparticles, and
nanocapsules or in macroemulsions.

When oral preparations are desired, the
compositions may be combined with typical carriers, such
as lactose, sucrose, starch, talc, magnesium stearate,
crystalline cellulose, methyl cellulose, carboxymethyl
cellulose, glycerin, sodium alginate or gum arabic among
others.

The E1 and core proteins of the present
invention may also be used as a delivery system for
antivirals to prevent or attenuate HCV infection in a mammal
by utilizing the property of both proteins to
self-aggregate in vitro to "trap" the antiviral within the
particles produced via aggregation of the core and E1
proteins. Examples of antivirals which could be
delivered by such a system include, but are not limited to,
antisense DNA or RNAs.

Vaccination can be conducted by conventional
methods. For example, the immunogen or immunogens (e.g.,
the E1 protein may be administered alone or in combination
with the E1 proteins derived from other isolates of HCV)
can be used in a suitable diluent such as saline or water,
or complete or incomplete adjuvants. Further, the
immunogen(s) may or may not be bound to a carrier to make
the protein(s) immunogenic. Examples of such carrier
molecules include, but are not limited to, bovine serum
albumin (BSA), keyhole limpet hemocyanin (KLH), tetanus
toxoid, and the like. The immunogen(s) can be
administered by any route appropriate for antibody
production such as intravenous, intraperitoneal,
intramuscular, subcutaneous, and the like. The
immunogen(s) may be administered once or at periodic
intervals until a significant titer of anti-HCV antibody
is produced. The antibody may be detected in the serum
using an immunoassay.

In yet another embodiment, the immunogen may be
a nucleic acid sequence capable of directing host organism
synthesis of E1 and/or core protein(s). Such a nucleic acid
sequence may be inserted into a suitable expression vector
by methods known to those skilled in the art. Expression
vectors suitable for producing high efficiency gene
transfer in vivo include retroviral, adenoviral and
vaccinia viral vectors. Operational elements of such
expression vectors are disclosed previously in the present
specification and are known to one skilled in the art.
Such expression vectors can be administered intravenously,
intramuscularly, subcutaneously, intraperitoneally or
orally.

In an alternative embodiment, direct gene
transfer may be accomplished via intramuscular injection
of, for example, plasmid-based eukaryotic expression
vectors containing a nucleic acid sequence capable of
directing host organism synthesis of E1 and/or core
protein(s). Such an approach has previously been utilized
to produce the hepatitis B surface antigen in vivo and
resulted in an antibody response to the surface antigen
(Davis, H.L. et al. (1993) Human Molecular Genetics,
2:1847-1851; see also Davis et al. (1993) Human Gene
Therapy, 4:151-159 and 733-740).

Doses of E1 and/or core protein(s)-encoding
nucleic acid sequence effective to elicit a protective
antibody response against HCV infection range from about 1
to about 500 ng. A more preferred range is about 1 to
about 500 ng.

The E1 and/or core proteins and expression
vectors containing a nucleic acid sequence capable of
directing host organism synthesis of E1 and/or core
protein(s) may be supplied in the form of a kit, alone, or
in the form of a pharmaceutical composition as described
above.

The administration of the immunogen(s) of the
present invention may be for either a prophylactic or
therapeutic purpose. When provided prophylactically, the
immunogen(s) is provided in advance of any exposure to HCV
or in advance of any symptoms due to HCV
infection. The prophylactic administration of the
immunogen serves to prevent or attenuate any subsequent
infection of HCV in a mammal. When provided
therapeutically, the immunogen(s) is provided at (or
shortly after) the onset of the infection or at the onset
of any symptom of infection or disease caused by HCV. The
therapeutic administration of the immunogen(s) serves to
attenuate the infection or disease.

In addition to use as a vaccine, the
compositions can be used to prepare antibodies to HCV E1
and core proteins. The antibodies can be used directly as
antiviral agents or they may be used in immunoassays
disclosed herein to detect HCV E1 and core proteins
present in patient sera. To prepare antibodies, a host
animal is immunized using the E1 and/or core proteins
native to the virus particle bound to a carrier as
described above for vaccines. The host serum or plasma is
collected following an appropriate time interval to
provide a composition comprising antibodies reactive with
the E1 or core protein of the virus particle. The gamma
globulin fraction or the IgG antibodies can be obtained,
for example, by use of saturated ammonium sulfate or DEAE
Sephadex, or other techniques known to those skilled in
the art. The antibodies are substantially free of many of
the adverse side effects which may be associated with
other antiviral agents such as drugs.

The antibody compositions can be made even more
compatible with the host system by minimizing potential
adverse immune system responses. This is accomplished by
removing all or a portion of the Fc portion of a foreign
species antibody or using an antibody of the same species
as the host animal, for example, the use of antibodies
from human/human hybridomas. Humanized antibodies (i.e.,
nonimmunogenic in a human) may be produced, for example,
by replacing an immunogenic portion of an antibody with a
corresponding, but nonimmunogenic portion (i.e., chimeric
antibodies). Such chimeric antibodies may contain the
reactive or antigen-binding portion of an antibody from
one species and the Fc portion of an antibody
(nonimmunogenic) from a different species. Examples of
chimeric antibodies include, but are not limited to,
non-human mammal-human chimeras, rodent-human chimeras,
murine-human and rat-human chimeras (Robinson et al.,
International Patent Application 184,187; Taniguchi M.,
European Patent Application 171,496; Morrison et al.,
European Patent Application 173,494; Neuberger et al., PCT
Application WO 86/01533; Cabilly et al., 1987 Proc. Natl.
Acad. Sci. USA 84:3439; Nishimura et al., 1987 Canc. Res.
47:999; Wood et al., 1985 Nature 314:446; Shaw et al.,
1988 J. Natl. Cancer Inst. 80:15553, all incorporated
herein by reference).

General reviews of "humanized" chimeric
antibodies are provided by Morrison S., 1985 Science
229:1202 and by Oi et al., 1986 BioTechniques 4:214.

Suitable "humanized" antibodies can be
alternatively produced by CDR or CEA substitution (Jones
et al., 1986 Nature 321:552; Verhoeyan et al., 1988
Science 239:1534; Biedler et al., 1988 J. Immunol. 141:4053,
all incorporated herein by reference).

The antibodies or antigen binding fragments may
also be produced by genetic engineering. The technology
for expression of both heavy and light chain genes in E.
coli is the subject of the PCT patent applications,
publication numbers WO 901443, WO 901443, and WO 9014424, and
of Huse et al., 1989 Science 246:1275-1281.

The antibodies can also be used as a means of
enhancing the immune response. The antibodies can be
administered in amounts similar to those used for other
therapeutic administrations of antibody. For example,
normal immune globulin is administered at 0.02-0.1 ml/lb
body weight during the early incubation period of other
viral diseases such as rabies, measles, and hepatitis B to
interfere with viral entry into cells. Thus, antibodies
reactive with the HCV E1 and/or core proteins can be
passively administered alone or in conjunction with
another antiviral agent to a host infected with an HCV to
enhance the immune response and/or the effectiveness of an
antiviral drug.

Alternatively, anti-HCV E1 antibodies and
anti-HCV core antibodies can be induced by administering
anti-idiotype antibodies as immunogens. Conveniently, a
purified anti-HCV E1 or anti-HCV core antibody preparation
prepared as described above is used to induce
anti-idiotype antibody in a host animal. The composition is
administered to the host animal in a suitable diluent.
Following administration, usually repeated administration,
the host produces anti-idiotype antibody. To eliminate an
immunogenic response to the Fc region, antibodies produced
by the same species as the host animal can be used or the
Fc region of the administered antibodies can be removed.
Following induction of anti-idiotype antibody in the host
animal, serum or plasma is removed to provide an antibody
composition. The composition can be purified as described
above for anti-HCV E1 and anti-HCV core antibodies, or by
affinity chromatography using anti-HCV E1 or anti-HCV core
antibodies bound to the affinity matrix. The
anti-idiotype antibodies produced are similar in conformation
to the authentic HCV E1 or core protein and may be used to
prepare an HCV vaccine rather than using an HCV E1 or core
protein.

When used as a means of inducing anti-HCV virus
antibodies in an animal, the manner of injecting the
antibody is the same as for vaccination purposes, namely
intramuscularly, intraperitoneally, subcutaneously or the
like in an effective concentration in a physiologically
suitable diluent with or without adjuvant. One or more
booster injections may be desirable.

The HCV E1 and core proteins of the invention
are also intended for use in producing antiserum designed
for pre- or post-exposure prophylaxis. Here an E1 or core
protein, or mixture of E1 and/or core proteins, is
formulated with a suitable adjuvant and administered by
injection to human volunteers, according to known methods
for producing human antisera. Antibody response to the
injected proteins is monitored, during a several-week
period following immunization, by periodic serum sampling
to detect the presence of anti-HCV E1 and/or anti-HCV core
serum antibodies, using an immunoassay as described
herein.

The antiserum from immunized individuals may be 
administered as a pre-exposure prophylactic measure for 
individuals who are at risk of contracting infection. The 
antiserum is also useful in treating an individual post- 
exposure, analogous to the use of high titer antiserum 
against hepatitis B virus for post-exposure prophylaxis. 

For both in vivo use of antibodies to HCV
virus-like particles and proteins and anti-idiotype antibodies
and diagnostic use, it may be preferable to use monoclonal
antibodies. Monoclonal anti-HCV E1 and anti-HCV core
protein antibodies or anti-idiotype antibodies can be
produced as follows. The spleen or lymphocytes from an
immunized animal are removed and immortalized or used to
prepare hybridomas by methods known to those skilled in
the art (Goding, J.W. 1983. Monoclonal Antibodies:
Principles and Practice, Academic Press, Inc., NY, NY,
pp. 56-97). To produce a human-human hybridoma, a human
lymphocyte donor is selected. A donor known to be
infected with HCV (where infection has been shown, for
example, by the presence of anti-virus antibodies in the
blood or by virus culture) may serve as a suitable
lymphocyte donor. Lymphocytes can be isolated from a
peripheral blood sample or spleen cells may be used if the
donor is subject to splenectomy. Epstein-Barr virus (EBV)
can be used to immortalize human lymphocytes or a human
fusion partner can be used to produce human-human
hybridomas. Primary in vitro immunization with peptides
can also be used in the generation of human monoclonal
antibodies.

Antibodies secreted by the immortalized cells
are screened to determine the clones that secrete
antibodies of the desired specificity. For monoclonal
anti-E1 and anti-core antibodies, the antibodies must bind
to HCV E1 and core proteins, respectively. For monoclonal
anti-idiotype antibodies, the antibodies must bind to
anti-E1 and anti-core protein antibodies, respectively.
Cells producing antibodies of the desired specificity are
selected.

The present invention also relates to the use of
single-stranded antisense poly- or oligonucleotides
derived from nucleotide sequences substantially homologous
to those shown in SEQ ID NOs:1-51 to inhibit the
expression of hepatitis C E1 genes. The present invention
further relates to the use of single-stranded antisense
poly- or oligonucleotides derived from nucleotide
sequences substantially homologous to those shown in SEQ
ID NOs:103-154 to inhibit the expression of hepatitis C
core genes. Alternatively, the antisense poly- or
oligonucleotides may be complementary to both the E1 and core
genes and hence inhibit the expression of both hepatitis
C E1 and core genes. By "substantially homologous", as used
throughout the specification and claims to describe the
nucleic acid sequences of the present invention, is meant
a level of homology between the nucleic acid sequence and
the SEQ ID NOs. referred to in the above sentence.
Preferably, the level of homology is in excess of 80%,
more preferably in excess of 90%, with a preferred nucleic
acid sequence being in excess of 95% homologous with the
DNA sequence shown in the indicated SEQ ID NO. These
antisense poly- or oligonucleotides can be either DNA or
RNA. The targeted sequence is typically messenger RNA and,
more preferably, a single sequence required for processing
or translation of the RNA. The antisense poly- or
oligonucleotides can be conjugated to a polycation such as
polylysine as disclosed in Lemaitre, M. et al. ((1989)
Proc. Natl. Acad. Sci. USA 84:648-652) and this conjugate
can be administered to a mammal in an amount sufficient
to hybridize to and inhibit the function of the messenger
RNA.
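As an illustration of the antisense principle described above, the
following minimal Python sketch derives an antisense DNA
oligonucleotide as the reverse complement of a messenger RNA
segment. The function name and the example target are hypothetical;
the target is an arbitrary string and is not taken from any SEQ ID NO
of this specification.

```python
# Hypothetical illustration: the target used below is an arbitrary RNA
# string, not a sequence from the specification.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def antisense_dna(target_mrna: str) -> str:
    """Return the antisense DNA oligonucleotide (written 5'->3')
    that is complementary to the given mRNA segment (written 5'->3')."""
    target = target_mrna.upper()
    # Complement each base while walking the target backwards, so that
    # the result is read in the conventional 5'->3' direction.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(target))


if __name__ == "__main__":
    print(antisense_dna("AUGAGCACGAAUCCU"))
```

An oligonucleotide produced this way can base-pair with the target
segment; whether it inhibits expression in practice depends on
factors that this sketch does not model.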

The present invention further relates to
multiple computer-generated alignments of the nucleotide
and deduced amino acid sequences shown in SEQ ID
NOs:1-206. Computer analysis of the nucleotide sequences shown
in SEQ ID NOs:1-51 and 103-154 and of the deduced amino
acid sequences shown in SEQ ID NOs:52-102 and 155-206 can
be carried out using commercially available computer
programs known to one skilled in the art.

In one embodiment, computer analysis of SEQ ID
NOs:1-51 by the program GENALIGN (Intelligenetics, Inc.,
Mountainview, CA) results in distribution of the 51 HCV E1
sequences into twelve genotypes based upon the degree of
variation of the sequences. For the purposes of the
present invention, the nucleotide sequence identity of E1
cDNAs of HCV isolates of the same genotype is in the range
of about 85% to about 100% whereas the identity of E1 cDNA
sequences of different genotypes is in the range of about
50% to about 80%.
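The identity ranges stated above can be illustrated with a short
Python sketch. It is not the GENALIGN procedure; it assumes the two
sequences have already been aligned to equal length, skips gap
columns (an assumption), and treats the "about 85%" and "about 50% to
about 80%" figures as hard cut-offs purely for illustration. The
function names are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two pre-aligned,
    equal-length nucleotide sequences ('-' marks an alignment gap)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns containing a gap in either sequence are skipped.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def genotype_relation(identity: float) -> str:
    """Interpret an E1 cDNA identity value using the ranges in the text."""
    if identity >= 85.0:
        return "same genotype"
    if 50.0 <= identity <= 80.0:
        return "different genotypes"
    # Values between the two stated ranges (or below 50%) are not
    # covered by the text, so they are reported as indeterminate.
    return "indeterminate"
```

For example, two aligned ten-base sequences differing at two
positions give 80% identity and fall in the "different genotypes"
range under these assumptions.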

The grouping of SEQ ID NOs:1-51 into twelve HCV
genotypes is shown below.

SEQ ID NOs:    Genotypes
1-8            I/1a
9-25           II/1b
26-29          III/2a
30-33          IV/2b
34             2c
35-39          V/3a
40             4a
41             4b
42-43          4c
44             4d
45-50          5a
51             6a
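The grouping in the table above can be expressed as a small lookup, as
in the following Python sketch. The data are copied from the table;
the names E1_GENOTYPE_RANGES and e1_genotype are hypothetical and
serve only to illustrate the mapping.

```python
# (first SEQ ID NO, last SEQ ID NO, genotype), copied from the table above.
E1_GENOTYPE_RANGES = [
    (1, 8, "I/1a"),
    (9, 25, "II/1b"),
    (26, 29, "III/2a"),
    (30, 33, "IV/2b"),
    (34, 34, "2c"),
    (35, 39, "V/3a"),
    (40, 40, "4a"),
    (41, 41, "4b"),
    (42, 43, "4c"),
    (44, 44, "4d"),
    (45, 50, "5a"),
    (51, 51, "6a"),
]


def e1_genotype(seq_id_no: int) -> str:
    """Return the genotype assigned to an E1 nucleotide SEQ ID NO (1-51)."""
    for first, last, genotype in E1_GENOTYPE_RANGES:
        if first <= seq_id_no <= last:
            return genotype
    raise KeyError(f"SEQ ID NO:{seq_id_no} is not an E1 nucleotide sequence")
```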

For those genotypes containing more than one E1
nucleotide sequence, computer alignment of the constituent
nucleotide sequences of the genotype was conducted using
GENALIGN in order to produce a consensus sequence for each
genotype. These alignments and their resultant consensus
sequences are shown in Figures 1A-1 thru 1G-5 for the
seven genotypes (I/1a, II/1b, III/2a, IV/2b, V/3a, 4c and
5a) which comprise more than one nucleotide sequence.
Further alignment of the consensus sequences of Figures
1A-1 thru 1G-5 with SEQ ID NO:34 (genotype 2c), SEQ ID
NO:40 (genotype 4a), SEQ ID NO:41 (genotype 4b), SEQ ID
NO:44 (genotype 4d) and SEQ ID NO:51 (genotype 6a)
produces a consensus sequence for all twelve genotypes as 
shown in Figures 1H-1 thru 1H-5. The multiple alignments 
of nucleotide sequences shown in Figures 1A-1 thru 1H-5 
produce consensus sequences which serve to highlight 
regions of homology and non-homology between sequences 
found within the same genotype or in different genotypes 
and hence, these alignments can be used by one skilled in 
the art to design oligonucleotides useful as reagents in 
diagnostic assays for HCV. 
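To illustrate how a consensus sequence highlights regions of homology
and non-homology, the following Python sketch builds a simple
majority-vote consensus from pre-aligned sequences and lists the
variable positions. This is not the GENALIGN algorithm, whose
internals are not described here; the tie rule (an "N" at positions
with no single most frequent base) and the function names are
assumptions made for the example.

```python
from collections import Counter


def consensus(aligned: list[str], ambiguous: str = "N") -> str:
    """Majority-vote consensus of equal-length, pre-aligned sequences."""
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        counts = Counter(base.upper() for base in column).most_common()
        top_base, top_count = counts[0]
        # Assumption: a tie for the most frequent base is marked ambiguous.
        if len(counts) > 1 and counts[1][1] == top_count:
            result.append(ambiguous)
        else:
            result.append(top_base)
    return "".join(result)


def variable_positions(aligned: list[str]) -> list[int]:
    """1-based positions at which the aligned sequences are not identical."""
    return [
        index
        for index, column in enumerate(zip(*aligned), start=1)
        if len({base.upper() for base in column}) > 1
    ]
```

Positions absent from the variable list are conserved across the
input sequences and are natural candidates for shared primers, while
variable positions are candidates for distinguishing sequences.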

Examples of purified and isolated
oligonucleotide sequences derived from the consensus
sequences shown in Figures 1A-1 thru 1H-5 include, but are
not limited to, SEQ ID NOs:213-239, where these
oligonucleotides are useful as "genotype-specific" primers
and probes since these oligonucleotides can hybridize
specifically to the nucleotide sequence of the E1 gene of
HCV isolates belonging to a single genotype. The
genotype-specificity of the oligonucleotides shown in SEQ
ID NOs:213-239 is as follows: SEQ ID NOs:213-214 are
specific for genotype I/1a; SEQ ID NOs:215-216 are
specific for genotype II/1b; SEQ ID NOs:217-218 are
specific for genotype III/2a; SEQ ID NOs:219-220 are
specific for genotype IV/2b; SEQ ID NOs:221-223 are
specific for genotype 2c; SEQ ID NOs:224-226 are specific
for genotype V/3a; SEQ ID NOs:227-228 are specific for
genotype 4a; SEQ ID NOs:229-230 are specific for genotype
4b; SEQ ID NOs:231-232 are specific for genotype 4c; SEQ
ID NOs:233-234 are specific for genotype 4d; SEQ ID
NOs:235-236 are specific for genotype 5a and SEQ ID
NOs:237-239 are specific for genotype 6a.

In another embodiment, the computer analysis of
SEQ ID NOs:103-154 by the program GENALIGN results in
distribution of the 52 HCV core sequences into 14
genotypes based upon the degree of variation of the
sequences.

The grouping of SEQ ID NOs:103-154 into 14 HCV
genotypes is shown below.

SEQ ID NOs:    Genotypes
103-108        I/1a
109-124        II/1b
125-128        III/2a
129-133        IV/2b
134            2c
135-138        V/3a
139            4a
141            4b
143            4c
144            4c
145            4d
142            4e
140            4f
146-153        5a
154            6a



These 14 genotypes can be further grouped into 6
major genotypes designated genotypes 1-6, where genotype 1
comprises the sequences contained in minor genotypes I/1a
and II/1b; genotype 2 comprises the sequences contained in
minor genotypes III/2a, IV/2b and 2c; genotype 3 comprises
sequences contained in genotype V/3a; genotype 4 comprises
sequences contained in minor genotypes 4a-4f; genotype 5
comprises the sequences contained in genotype 5a and
genotype 6 comprises the sequence contained in genotype
6a. Computer alignment of the constituent nucleotide
sequences of the core cDNAs falling within genotypes I/1a,
II/1b, III/2a, IV/2b, V/3a and 5a, to produce a consensus
sequence for each of these genotypes is shown in Figures
6A-1 thru 6A-4 (I/1a), 6B-1 thru 6B-10 (II/1b), 6D-1 thru
6D-3 (III/2a), 6E-1 thru 6E-4 (IV/2b), 6G-1 thru 6G-3
(V/3a) and 6I-1 thru 6I-4 (5a). The alignment of the
sequences found in minor genotypes I/1a and II/1b to
produce a consensus sequence for major genotype 1 is shown
in Figures 6C-1 thru 6C-10. The alignment of the
sequences contained in minor genotypes III/2a, IV/2b and
2c to produce a consensus sequence for major genotype 2 is
shown in Figures 6F-1 thru 6F-5. The alignment of the
nucleotide sequences contained in minor genotypes 4a-4f to
produce a consensus sequence for major genotype 4 is shown
in Figures 6H-1 thru 6H-4. Further alignment of the
consensus sequences shown in Figures 6C-1 thru 6C-10, 6F-1
thru 6F-5, 6G-1 thru 6G-3, 6H-1 thru 6H-4 and 6I-1 thru
6I-4 with SEQ ID NO:154 (genotype 6a/major genotype 6) to
produce a consensus sequence for all genotypes is shown in
Figures 6J-1 thru 6J-4 and alignment of the consensus
sequences shown in Figures 6A-1 thru 6A-4, 6B-1 thru
6B-10, 6D-1 thru 6D-3, 6E-1 thru 6E-4, 6G-1 thru 6G-3 and
6I-1 thru 6I-4 with SEQ ID NO:139 (genotype 4a), SEQ ID
NO:141 (genotype 4b), SEQ ID NO:143 (genotype 4c), SEQ ID
NO:145 (genotype 4d), SEQ ID NO:142 (genotype 4e), SEQ ID
NO:140 (genotype 4f) and SEQ ID NO:154 (genotype 6a) to
produce a consensus sequence for all fourteen genotypes is
shown in Figures 6K-1 and 6K-2. As with the alignments of
the envelope (E1) nucleotide sequences, the consensus
sequences shown in Figures 6A-1 thru 6K-2 serve to
highlight regions of homology and non-homology between
sequences found within the same genotype or in different
genotypes and hence can be used by one skilled in the art
to design oligonucleotides useful as reagents in
diagnostic assays for HCV.
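The grouping of the 14 minor core genotypes into 6 major genotypes
described above can be written down as a simple mapping, as in the
following Python sketch. The membership is taken from the text; the
variable names are hypothetical.

```python
# Major genotype -> minor genotypes, as stated in the text above.
MAJOR_GENOTYPES = {
    1: ["I/1a", "II/1b"],
    2: ["III/2a", "IV/2b", "2c"],
    3: ["V/3a"],
    4: ["4a", "4b", "4c", "4d", "4e", "4f"],
    5: ["5a"],
    6: ["6a"],
}

# Inverted view: minor genotype -> major genotype.
MINOR_TO_MAJOR = {
    minor: major
    for major, minors in MAJOR_GENOTYPES.items()
    for minor in minors
}
```

With this mapping, MINOR_TO_MAJOR["IV/2b"] is 2 and the dictionary
holds exactly 14 minor genotypes.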

For example, purified and isolated
oligonucleotide sequences derived from the consensus
sequences shown in Figures 6A-1 thru 6K-2 may be useful as
genotype-specific primers and probes since these
oligonucleotides can hybridize specifically to the
nucleotide sequence of the core gene of HCV isolates
belonging to a given genotype. Examples of regions of the
consensus sequence of the core gene of a given genotype
from which primers specific for that genotype may be
deduced include, but are not limited to, the nucleotide
domains shown below for each genotype. The sequence in
which the indicated nucleotide domains are found is
indicated in parentheses to the right of each genotype.

Genotype 1 (Consensus Sequence of Figures 6C-1 thru 6C-10)
427-466, 444-483, 447-486 (5'-3', sense)
505-466, 522-483, 525-486 (5'-3', antisense)

Genotype 1a (Consensus Sequence of Figures 6A-1 thru 6A-4)
141-180, 279-318 (5'-3', sense)
219-180, 246-207 (5'-3', antisense)

Genotype 1b (Consensus Sequence of Figures 6B-1 thru 6B-10)
67-106, 127-186, 234-273 (5'-3', sense)
144-106, 225-186, 311-272, 312-273 (5'-3', antisense)

Genotype 2 (Consensus Sequence of Figures 6F-1 thru 6F-5)
153-192, 162-201, 164-203, 168-207, 171-210, 182-221,
192-231, 193-232, 302-341 (5'-3', sense)
231-192, 240-201, 242-203, 246-207, 249-210, 260-221,
270-231, 271-232, 380-341 (5'-3', antisense)

Genotype III/2a (Consensus Sequence of Figures 6D-1 thru 6D-3)
276-315, 306-355 (5'-3', sense)
309-270, 354-315, 394-355, 571-532 (5'-3', antisense)

Genotype IV/2b (Consensus Sequence of Figures 6E-1 thru 6E-4)
6-45, 135-174, 177-216, 309-348, 337-376, 375-414, 501-540
(5'-3', sense)
84-45, 213-174, 255-216, 387-348, 415-376, 453-414,
571-532, 573-540 (5'-3', antisense)

Genotype 2c (SEQ ID NO:134)
194-233, 273-312, 279-318, 417-456, 423-462, 504-543,
505-544, 517-556 (5'-3', sense)
272-233, 351-312, 354-315, 357-318, 450-411, 495-456,
501-462, 573-543, 556-573 (5'-3', antisense)

Genotype 3 or Genotype V/3a (Consensus Sequence of Figures
6G-1 thru 6G-3)
8-47, 45-84, 68-107, 87-126, 88-127, 90-129, 111-150,
142-181, 173-212, 177-216, 261-300, 276-315, 452-491,
520-559, 521-560, 529-568, 532-571, 533-572 (5'-3', sense)
86-47, 123-84, 146-107, 165-126, 186-147, 189-150,
219-180, 250-211, 251-212, 255-216, 339-300, 530-491,
573-543, 573-557, 573-559, 573-560 (5'-3', antisense)

Genotype 4 (Consensus Sequence of Figures 6H-1 thru 6H-4)
20-59 (5'-3', sense)
97-58, 98-59 (5'-3', antisense)

Genotype 4a (SEQ ID NO:139)
111-150, 150-189, 174-213, 183-222, 192-231, 261-300,
376-415, 396-435, 531-570 (5'-3', sense)
186-147, 252-213, 270-231, 339-300, 454-415 (5'-3',
antisense)

Genotype 4b (SEQ ID NO:141)
27-66, 30-69, 106-145, 271-310, 433-472, 447-486, 453-492
(5'-3', sense)
105-66, 183-144, 184-145, 345-306, 348-309, 349-310,
468-429, 510-471, 522-483, 570-531 (5'-3', antisense)

Genotype 4c (SEQ ID NO:143)
174-213, 180-219, 207-246, 231-270 (5'-3', sense)
249-210, 252-213, 258-219, 309-270, 504-465 (5'-3',
antisense)

Genotype 4d (SEQ ID NO:145)
173-212, 188-327, 430-469 (5'-3', sense)
248-209, 249-210, 250-211, 251-212, 366-327, 508-469
(5'-3', antisense)

Genotype 4e (SEQ ID NO:142)
160-199, 267-306, 287-326, 288-327, 524-564 (5'-3', sense)
238-199, 345-306, 365-326, 216-177, 522-483 (5'-3',
antisense)

Genotype 4f (SEQ ID NO:140)
18-57, 36-75, 228-267, 396-435 (5'-3', sense)
96-57, 114-75, 306-267 (5'-3', antisense)

Genotype 5 or 5a (Consensus Sequence of Figures 6I-1 thru
6I-4)
176-215, 177-216, 181-220, 195-234, 221-260, 252-291,
255-294, 396-435, 435-474, 447-486, 498-537 (5'-3', sense)
254-215, 299-260, 310-271, 330-291, 333-294, 354-315,
464-425, 471-432, 483-444, 570-531 (5'-3', antisense)

Genotype 6 or 6a (SEQ ID NO:154)
20-59, 136-175, 156-195, 159-198, 175-214, 185-224,
277-316, 278-317, 312-351, 348-387, 405-444, 406-445,
407-446, 408-447, 411-450, 432-471, 433-472, 435-474,
522-561 (5'-3', sense)
98-59, 214-175, 234-195, 237-198, 253-214, 262-223,
263-224, 354-315, 355-316, 382-343, 390-351, 426-387,
468-429, 483-444, 484-445, 485-446, 486-447, 489-450,
510-471, 511-472, 513-474 (5'-3', antisense)

Such nucleotide domains may range from about 15 
to about 100 bases in length with a more preferred range 
being about 30 to about 60 bases in length. 

In an alternative embodiment, universal primers
able to hybridize to the nucleotide sequences of the core
gene of HCV isolates belonging to all of the genotypes
disclosed herein may be deduced from universally conserved
nucleotide domains of the consensus sequence shown in
Figures 6J-1 thru 6K-2. Examples of such nucleotide
domains include, but are not limited to, those shown
below:

nucleotides 1-20, 1-25, 1-26, 1-27, 1-33, 50-89,
51-90, 52-91, 53-92, 61-100, 62-101, 77-116, 78-117,
79-118, 80-119, 81-120, 82-121, 83-122, 84-123, 85-124,
86-125, 97-136, 98-137, 99-138, 100-139, 101-140, 102-141,
329-368, 330-369, 331-370, 332-371, 354-393, 355-394,
356-395, 362-401, 363-402, 364-403, 365-404, 369-408,
442-481, 443-482, 457-496, 458-497, 475-514, 476-515,
477-516 (5'-3', sense); and

nucleotides 40-1, 41-2, 42-3, 43-4, 51-12, 52-13,
55-16, 56-17, 57-18, 58-19, 61-22, 62-23, 63-24,
64-25, 70-31, 124-85, 125-86, 126-87, 127-88, 128-89,
129-90, 136-97, 137-98, 138-99, 149-110, 150-111,
151-112, 152-113, 153-114, 154-115, 155-116, 156-117,
157-118, 158-119, 159-120, 170-131, 171-132, 172-133,
173-134, 174-135, 175-136, 403-364, 405-365, 406-366,
406-367, 430-391, 431-392, 432-393, 436-397, 437-398,
438-399, 439-400, 517-478, 518-479, 519-480, 532-493,
533-494, 550-511, 551-512 (5'-3', antisense).

Those skilled in the art would readily
understand that the term "antisense" as used herein refers
to primer sequences which are the complementary sequence
of the indicated consensus sequence or SEQ ID NO.
Further, provided with the above examples of regions of
the consensus sequences or indicated SEQ ID NOs from
which to deduce universal and genotype-specific primers,
those skilled in the art would readily be able to select
pairs of primers, one sense and one antisense, which would
be useful in the detection of HCV genotypes via the PCR
methods described herein.
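The coordinate notation used in the lists above can be illustrated
with a short Python sketch. It assumes that a domain written
low-high denotes a sense primer read directly from the template and
that a domain written high-low denotes an antisense primer, i.e. the
reverse complement of that stretch, consistent with the definition of
"antisense" given above. The function names and the toy template are
hypothetical; the template is not a sequence from the specification.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def primer_from_domain(template: str, start: int, end: int) -> str:
    """Primer (written 5'->3') for a 1-based, inclusive domain 'start-end'.

    start <= end gives a sense primer; start > end gives an antisense
    primer, i.e. the reverse complement of the template stretch.
    """
    low, high = min(start, end), max(start, end)
    if low < 1 or high > len(template):
        raise ValueError("domain lies outside the template")
    segment = template[low - 1:high]
    if start <= end:
        return segment
    return segment.translate(COMPLEMENT)[::-1]


def amplicon_length(sense_domain: tuple[int, int],
                    antisense_domain: tuple[int, int]) -> int:
    """Length of the PCR product bounded by a sense/antisense primer pair."""
    sense_start = sense_domain[0]
    antisense_start = antisense_domain[0]  # highest template coordinate
    if antisense_start <= sense_start:
        raise ValueError("antisense primer must lie downstream of sense primer")
    return antisense_start - sense_start + 1
```

For instance, pairing the genotype 1 sense domain 427-466 with the
antisense domain 525-486 would, under these assumptions, bound a
product of 525 - 427 + 1 = 99 nucleotides; whether a particular pair
performs well in PCR is not something this arithmetic can establish.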

In yet another embodiment, the sequences shown
in SEQ ID NOs:103-154 and the resultant consensus
sequences produced by alignment of these SEQ ID NOs as
shown in Figures 6A-1 thru 6K-2 may also be useful in the
design of hybridization probes specific for a given HCV
genotype. Examples of nucleotide domains of the consensus
sequence or SEQ ID NO of a given genotype from which
genotype-specific hybridization probes may be deduced
include, but are not limited to, those shown below, where
the sequence in which the domains are found is indicated
in parentheses to the right of each genotype.

Genotype (sequence)
    Positions

1a (Consensus sequence of Figures 6A-1 thru 6A-4)
    50-85, 155-205, 207-277, 281-333, 429-477, 530-573

1b (Consensus sequence of Figures 6B-1 thru 6B-10)
    81-131, 159-225, 252-318, 411-472, 530-573

2a (Consensus sequence of Figures 6D-1 thru 6D-3)
    35-75, 200-276, 290-340, 330-380, 410-472, 530-573

2b (Consensus sequence of Figures 6E-1 thru 6E-4)
    20-70, 149-199, 191-241, 240-285, 261-318, 323-373,
    351-401, 389-439, 429-477, 530-573

2c (SEQ ID NO:134)
    208-258, 230-276, 290-345, 411-460, 430-490, 530-573

3a (Consensus sequence of Figures 6G-1 thru 6G-3)
    1-50, 40-100, 100-160, 145-190, 190-240, 275-325,
    411-455, 466-516, 530-573

4a (SEQ ID NO:139)
    35-85, 145-195, 200-250, 255-305, 341-390, 390-440,
    530-573

4b (SEQ ID NO:141)
    35-85, 120-170, 180-225, 230-275, 285-335, 405-455,
    462-492, 530-573

4c (SEQ ID NO:143)
    35-85, 190-246, 245-295, 282-318, 372-415, 440-480,
    530-573

4d (SEQ ID NO:145)
    35-85, 187-237, 302-352, 405-455, 444-494, 530-573

4e (SEQ ID NO:142)
    35-85, 57-84, 174-224, 230-275, 290-340, 422-472,
    530-573

4f (SEQ ID NO:140)
    35-85, 174-224, 242-292, 290-340, 422-472, 530-573

5a (Consensus sequence of Figures 6I-1 thru 6I-4)
    180-234, 265-315, 315-355, 420-486, 530-573

6a (SEQ ID NO:154)
    34-84, 150-200, 180-230, 230-290, 291-333, 341-395,
    429-490, 530-573

1 (Consensus sequence of Figures 6C-1 thru 6C-10)
    192-241, 435-495

2 (Consensus sequence of Figures 6F-1 thru 6F-5)
    186-240, 320-360, 440-475

4 (Consensus sequence of Figures 6H-1 thru 6H-4)
    40-80

In yet another embodiment, universal
hybridization probes may be derived from the consensus
sequences shown in Figures 6J-1 thru 6K-2. Examples of
nucleotide domains of the consensus sequences shown in
Figures 6J-1 thru 6K-2 from which universal hybridization
probes may be derived include, but are not limited to,
1-33; 85-141; 364-408; 478-516.

The oligonucleotides of this invention can be
synthesized using any of the known methods of
oligonucleotide synthesis (e.g., the phosphodiester method
of Agarwal et al. 1972, Angew. Chem. Int. Ed. Engl.
11:451, the phosphotriester method of Hsiung et al. 1979,
Nucleic Acids Res. 6:1371, or the automated
diethylphosphoramidite method of Beaucage et al. 1981,
Tetrahedron Letters 22:1859-1862), or they can be isolated
fragments of naturally occurring or cloned DNA. In
addition, those skilled in the art would be aware that
oligonucleotides can be synthesized by automated
instruments sold by a variety of manufacturers or can be
commercially custom ordered and prepared. In a preferred
embodiment, the oligonucleotides of the present invention
are synthetic oligonucleotides. The oligonucleotides of
the present invention may range from about 15 to about 100
nucleotides, with the preferred sizes being about 20 to
about 60 nucleotides, a more preferred size being about 25
to about 50 nucleotides, and a most preferred size being
about 30 to about 40 nucleotides.
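The nested size ranges given above can be summarized in a few lines
of Python. The sketch treats each "about" figure as a hard boundary,
which is an assumption made only for illustration, and the function
name is hypothetical.

```python
def size_preference(length: int) -> str:
    """Classify an oligonucleotide length against the ranges in the text.

    Assumption: the 'about' limits are treated as exact, inclusive bounds.
    """
    if 30 <= length <= 40:
        return "most preferred"
    if 25 <= length <= 50:
        return "more preferred"
    if 20 <= length <= 60:
        return "preferred"
    if 15 <= length <= 100:
        return "within range"
    return "outside range"
```

Under these assumptions, a 40-base domain such as those listed
earlier falls in the most preferred size range.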

The present invention also relates to methods 
for detecting the presence of HCV in a mammal, said 
methods comprising analyzing the RNA of a mammal for the 
presence of hepatitis C virus. 

The RNA to be analyzed can be isolated from
serum, liver, saliva, lymphocytes or other mononuclear
cells as viral RNA, whole cell RNA or as poly(A)+ RNA.
Whole cell RNA can be isolated by methods known to those
skilled in the art. Such methods include extraction of
RNA by differential precipitation (Birnboim, H.C. (1988)
Nucleic Acids Res., 16:1487-1497), extraction of RNA by
organic solvents (Chomczynski, P. et al. (1987) Anal.
Biochem., 162:156-159) and extraction of RNA with strong
denaturants (Chirgwin, J.M. et al. (1979) Biochemistry,
18:5294-5299). Poly(A)+ RNA can be selected from whole
cell RNA by affinity chromatography on oligo-d(T) columns
(Aviv, H. et al. (1972) Proc. Natl. Acad. Sci.,
69:1408-1412). A preferred method of isolating RNA is
extraction of viral RNA by the guanidinium-phenol-chloroform
method of Bukh et al. (1992a).

The methods for analyzing the RNA for the
presence of HCV include Northern blotting (Alwine, J.C. et
al. (1977) Proc. Natl. Acad. Sci., 74:5350-5354), dot and
slot hybridization (Kafatos, F.C. et al. (1979) Nucleic
Acids Res., 7:1541-1552), filter hybridization (Hollander,
M.C. et al. (1990) Biotechniques, 9:174-179), RNase
protection (Sambrook, J. et al. (1989) in "Molecular
Cloning, A Laboratory Manual", Cold Spring Harbor Press,
Plainview, NY) and reverse-transcription polymerase chain
reaction (RT-PCR) (Watson, J.D. et al. (1992) in
"Recombinant DNA" Second Edition, W.H. Freeman and
Company, New York).

A preferred method for analyzing the RNA is RT-PCR.
In this method, the RNA can be reverse transcribed
to first strand cDNA using a primer or primers derived
from the nucleotide sequences shown in SEQ ID NOs:1-51 or
SEQ ID NOs:103-154 or sequences complementary to those
described. Once the cDNAs are synthesized, PCR
amplification is carried out using pairs of primers
designed to hybridize with sequences in the HCV E1 or core
cDNA which are an appropriate distance apart (at least
about 50 nucleotides) to permit amplification of the cDNA
and subsequent detection of the amplification product.
Alternatively, one can amplify both E1 and core cDNA
sequences by using a primer pair where one primer
hybridizes with the E1 cDNA sequence and the other primer
hybridizes with the core cDNA sequence. Each primer of a
pair is a single-stranded oligonucleotide of about 20 to
about 60 bases in length, with a more preferred range being
about 30 to about 50 bases in length, where one primer (the
"upstream" primer) is complementary to the original RNA
and the second primer (the "downstream" primer) is
complementary to the first strand of cDNA generated by
reverse transcription of the RNA. The target sequence is
generally about 100 to about 300 base pairs long but can
be as large as 500-1500 base pairs. Optimization of the
amplification reaction to obtain sufficiently specific
hybridization to the nucleotide sequence of interest
(either E1 or core or both E1 and core) is well within the
skill in the art and is preferably achieved by adjusting
the annealing temperature.
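
The numerical constraints stated above (primer length of about 20 to about 60 bases, primers at least about 50 nucleotides apart) can be checked mechanically. The following is a minimal illustrative sketch only, not part of the disclosed method. The names `Primer`, `check_primer_pair` and `amplicon_length` are hypothetical, and positions are assumed to be 1-based coordinates on the sense strand of the target, with `left` lying 5' of `right`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    """A primer located by 1-based, inclusive sense-strand coordinates."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def check_primer_pair(left, right, min_len=20, max_len=60, min_gap=50):
    """Return a list of problems; an empty list means the pair passes.

    The default limits mirror the approximate values given in the text.
    """
    problems = []
    for primer in (left, right):
        if not min_len <= primer.length <= max_len:
            problems.append(
                f"{primer.name}: length {primer.length} outside "
                f"{min_len}-{max_len}"
            )
    # Number of target nucleotides lying strictly between the two primers.
    gap = right.start - left.end - 1
    if gap < min_gap:
        problems.append(f"primers only {gap} nt apart (minimum {min_gap})")
    return problems


def amplicon_length(left, right):
    """Length of the amplified product, primers included."""
    return right.end - left.start + 1


if __name__ == "__main__":
    left = Primer("left", 197, 226)
    right = Primer("right", 450, 480)
    print(check_primer_pair(left, right), amplicon_length(left, right))
```

The annealing temperature, which the text names as the preferred optimization parameter, is determined empirically and is deliberately not modelled here.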

In one embodiment, the primer pairs selected to
amplify E1 and core cDNAs are universal primers. By
"universal", as used to describe primers throughout the
claims and specification, is meant those primer pairs
which can amplify E1 and/or core gene fragments derived
from an HCV isolate belonging to any one of the genotypes
of HCV described herein. Purified and isolated universal
primers for E1 cDNAs are used in Example 1 of the present
invention and are shown as SEQ ID NOs:207-212 where SEQ ID
NOs:207 and 208 represent one pair of primers, SEQ ID
NOs:209 and 210 represent a second pair of primers and SEQ
ID NOs:211-212 represent a third pair of primers.
Nucleotide domains of the consensus sequence shown in
Figures 6J-1 thru 6J-4 from which universal primers for
core cDNAs may be deduced have previously been disclosed
within the present specification. Alternatively, a
universal primer for E1 cDNA sequence and a universal
primer for core cDNA sequence may be used as a universal
primer pair to amplify both E1 and core cDNAs.

In an alternative embodiment, primer pairs
selected to amplify E1 and/or core cDNAs are
genotype-specific primers. In the present invention,
genotype-specific primer pairs can readily be derived from
the following genotype-specific E1 nucleotide domains:
nucleotides 197-238 and 450-480 of the consensus sequence
of genotype I/1a shown in Figures 1A-1 and 1A-4;
nucleotides 197-238 and 450-480 of the consensus sequence
of genotype II/1b shown in Figures 1B-4 and 1B-8;
nucleotides 199-238 and 438-480 of the consensus sequence
of genotype III/2a shown in Figures 1C-2 and 1C-4;
nucleotides 124-177 and 450-480 of the consensus sequence
of genotype IV/2b shown in Figures 1D-2 and 1D-4;
nucleotides 124-177, 193-238 and 436-480 of SEQ ID NO:34
(genotype 2c); nucleotides 168-207, 294-339 and 406-480 of
the consensus sequence of genotype V/3a shown in Figures
1E-2, 1E-3 and 1E-4; nucleotides 145-183 and 439-480 of
SEQ ID NO:40 (genotype 4a); nucleotides 168-207 and
432-480 of SEQ ID NO:41 (genotype 4b); nucleotides 130-183 and
450-480 of the consensus sequence of genotype 4c shown in
Figures 1F-1 and 1F-2; nucleotides 130-183 and 450-480 of
SEQ ID NO:44 (genotype 4d); nucleotides 166-208 and
437-480 of the consensus sequence of genotype 5a shown in
Figures 1G-2 and 1G-4; and nucleotides 168-207, 216-252 and
429-480 of SEQ ID NO:51 (genotype 6a). Genotype-specific
HCV core nucleotide domains from which genotype-specific
primers may be deduced have previously been described
herein. Those skilled in the art would readily appreciate
that in a pair of genotype-specific primers, each primer
is derived from different nucleotide domains specific for
a given genotype. Also, it is understood by those skilled
in the art that each pair of primers comprises one primer
which is complementary to the original viral RNA and the
other which is complementary to the first strand of cDNA
generated by reverse transcription of the viral RNA. For
example, in a pair of genotype-specific primers for
genotype 4b, one primer would have a nucleotide sequence
derived from region 168-207 of SEQ ID NO:41 and the other
primer would have a nucleotide sequence which is the
complement of region 432-480 of SEQ ID NO:41. One skilled
in the art would readily recognize that such
genotype-specific domains would also be useful in designing
oligonucleotides for use as genotype-specific
hybridization probes. Indeed, genotype-specific
hybridization probes deduced from the E1 and core
sequences of the present invention have been previously
disclosed herein.
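
As an illustration of how a primer pair could be read off two genotype-specific domains, the sketch below extracts one domain as written and returns the reverse complement of the other. It is a simplified example under stated assumptions: the function names are hypothetical, coordinates are taken to be 1-based and inclusive on the sense strand, and the sequence used in the demonstration is an arbitrary 576-nucleotide placeholder, not any SEQ ID NO of this specification.

```python
# Complement table for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def domain(seq, start, end):
    """Extract a 1-based, inclusive nucleotide domain from seq."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"domain {start}-{end} outside 1-{len(seq)}")
    return seq[start - 1:end]


def primer_pair(seq, first_domain, second_domain):
    """Return (sense primer, antisense primer) for two domains.

    The first primer is the 5' domain as written; the second is the
    reverse complement of the 3' domain, so the two primers point
    towards each other across the target.
    """
    sense = domain(seq, *first_domain)
    antisense = reverse_complement(domain(seq, *second_domain))
    return sense, antisense


if __name__ == "__main__":
    placeholder = "ACGT" * 144  # arbitrary 576-nt stand-in sequence
    fwd, rev = primer_pair(placeholder, (168, 207), (432, 480))
    print(len(fwd), len(rev))
```

In practice a primer would usually be a shorter stretch chosen within each domain rather than the whole domain; the sketch uses whole domains only to keep the coordinate handling visible.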

The amplification products of PCR can be
detected either directly or indirectly. In one
embodiment, direct detection of the amplification products
is carried out via labelling of primer pairs. Labels
suitable for labelling the primers of the present
invention are known to one skilled in the art and include
radioactive labels, biotin, avidin, enzymes and
fluorescent molecules. The desired labels can be
incorporated into the primers prior to performing the
amplification reaction. A preferred labelling procedure
utilizes radiolabeled ATP and T4 polynucleotide kinase
(Sambrook, J. et al. (1989) in "Molecular Cloning, A
Laboratory Manual", Cold Spring Harbor Press, Plainview,
NY). Alternatively, the desired label can be incorporated
into the primer extension products during the
amplification reaction in the form of one or more labelled
dNTPs. In the present invention, the labelled amplified
PCR products can be detected by agarose gel
electrophoresis followed by ethidium bromide staining and
visualization under ultraviolet light or via direct
sequencing of the PCR products. Thus, in one embodiment,
the present invention relates to a method for determining
the genotype of a hepatitis C virus present in a mammal
where said method comprises: amplifying RNA of a mammal
via RT-PCR using labelled genotype-specific primers for
the amplification step of the cDNA produced by reverse
transcription.

In yet another embodiment, unlabelled
amplification products can be detected via hybridization
with labelled nucleic acid probes, radioactively labelled
or labelled with biotin, in methods known to one skilled
in the art such as dot and slot blot hybridization
(Kafatos, F.C. et al. (1979)) or filter hybridization
(Hollander, M.C. et al. (1990)).

In one embodiment, the nucleic acid sequences
used as probes are selected from, and substantially
homologous to, SEQ ID NOs:1-51 and/or SEQ ID NOs:103-154.
Such probes are useful as universal probes in that they
can detect PCR amplification products of E1 and/or core
cDNAs of an HCV isolate belonging to any of the HCV
genotypes disclosed herein. The size of these probes can
range from about 200 to about 500 nucleotides. In an
alternative embodiment, the sequence alignments shown in
Figures 1A-1 thru 1H-5 and 6A-1 thru 6J-4 may be used to
design oligonucleotides useful as universal hybridization
probes. Examples of core and envelope nucleotide domains
from which such universal oligonucleotides may be deduced
are disclosed herein.

In yet another embodiment, the present invention
relates to a method for determining the genotype of a
hepatitis C virus present in a mammal where said method
comprises:

(a) amplifying RNA of a mammal via RT-PCR to
produce amplification products;

(b) contacting said products with at least one
genotype-specific oligonucleotide; and

(c) detecting complexes of said products which
bind to said oligonucleotide(s).

In this method, one embodiment of said
amplification step is carried out using the universal
primers for E1 or core cDNAs as disclosed above. In step
(b) of this method, the genotype-specific sequences used
as probes may be deduced from the genotype-specific E1 and
core nucleotide domains disclosed herein. These probes
are useful in specifically detecting PCR amplification
products of E1 or core cDNAs of HCV isolates belonging to
one of the HCV genotypes disclosed herein. In a preferred
embodiment, these probes are used alone or in combination
with other probes specific to the same genotype.

For example, a probe having a sequence according
to SEQ ID NO:213 can be used alone or in combination with
a probe having a sequence according to SEQ ID NO:214. The
probes used in this method can range in size from about 15
to about 100 nucleotides with a more preferred range being
about 30 to about 70 nucleotides. Such probes can be
synthesized as described earlier.

In an alternative embodiment, the genotype of
the amplification product of step (a) may be determined by
using the nucleic acid sequences shown in SEQ ID NOs:1-51
and 103-154 as probes (Delwart, E. et al. (1993)
Science, 262:1257-1261). Probes utilized in the method
of Delwart et al. may range in size from about 100 to
about 1,000 nucleotides with a more preferred probe size
being about 200 to about 800 base pairs and a most
preferred probe size being about 300 to about 700
nucleotides.

The nucleic acid sequence used as a probe to
detect PCR amplification products of the present invention
can be labeled in single-stranded or double-stranded form.
Labelling of the nucleic acid sequence can be carried out
by techniques known to one skilled in the art. Such
labelling techniques can include radiolabels and enzymes
(Sambrook, J. et al. (1989) in "Molecular Cloning, A
Laboratory Manual", Cold Spring Harbor Press, Plainview,
New York). In addition, there are known non-radioactive
techniques for signal amplification including methods for
attaching chemical moieties to pyrimidine and purine rings
(Dale, R.N.K. et al. (1973) Proc. Natl. Acad. Sci.,
70:2238-2242; Heck, R.F. (1968) J. Am. Chem. Soc.,
90:5518-5523), methods which allow detection by
chemiluminescence (Barton, S.K. et al. (1992) J. Am. Chem.
Soc., 114:8736-8740), methods utilizing biotinylated
nucleic acid probes (Johnson, T.K. et al. (1983) Anal.
Biochem., 133:126-131; Erickson, P.F. et al. (1982) J. of
Immunology Methods, 51:241-249; Matthaei, F.S. et al.
(1986) Anal. Biochem., 157:123-128) and methods which
allow detection by fluorescence using commercially
available products.

The present invention also relates to computer
analysis of the amino acid sequences shown in SEQ ID
NOs:52-102 by the program GENALIGN. This analysis groups
the 51 amino acid sequences shown in SEQ ID NOs:52-102
into twelve genotypes based upon the degree of variation
of the amino acid sequences. For the purposes of the
present invention, the amino acid sequence identity of E1
amino acid sequences of the same genotype ranges from
about 85% to about 100% whereas the identity of E1 amino
acid sequences of different genotypes ranges from about
45% to about 80%.
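
The identity ranges above can be illustrated with a short calculation. The sketch below is an assumption-laden simplification and is not the GENALIGN procedure: it presumes the two sequences are already aligned to equal length, counts every column (including any gap characters) in the denominator, and uses the approximate 85% and 80% figures from the text as cut-offs. The function names and the toy sequences in the demonstration are hypothetical.

```python
def percent_identity(a, b):
    """Percent of aligned positions at which two sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def relation(identity, same_min=85.0, diff_max=80.0):
    """Classify a pairwise identity using the approximate ranges above."""
    if identity >= same_min:
        return "same genotype"
    if identity <= diff_max:
        return "different genotype"
    # Values between the two ranges are not covered by the text.
    return "indeterminate"


if __name__ == "__main__":
    pid = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")  # toy sequences
    print(pid, relation(pid))
```

Because the ranges are stated as "about", identities falling between 80% and 85% are reported as indeterminate rather than forced into either class.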

The grouping of SEQ ID NOs:52-102 into twelve
HCV genotypes is shown below:

SEQ ID NOs      Genotypes

52-59           I/1a
60-76           II/1b
77-80           III/2a
81-84           IV/2b
85              2c
86-90           V/3a
91              4a
92              4b
93-94           4c
95              4d
96-101          5a
102             6a
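
For readers who wish to work with this grouping programmatically, the table can be encoded as a simple range lookup. This is an illustrative sketch only; the names `E1_GENOTYPE_RANGES` and `genotype_for_seq_id` are hypothetical, and the data are copied directly from the table above.

```python
# (first SEQ ID NO, last SEQ ID NO, genotype), transcribed from the table.
E1_GENOTYPE_RANGES = [
    (52, 59, "I/1a"),
    (60, 76, "II/1b"),
    (77, 80, "III/2a"),
    (81, 84, "IV/2b"),
    (85, 85, "2c"),
    (86, 90, "V/3a"),
    (91, 91, "4a"),
    (92, 92, "4b"),
    (93, 94, "4c"),
    (95, 95, "4d"),
    (96, 101, "5a"),
    (102, 102, "6a"),
]


def genotype_for_seq_id(seq_id):
    """Return the genotype assigned to an E1 amino acid SEQ ID NO."""
    for first, last, genotype in E1_GENOTYPE_RANGES:
        if first <= seq_id <= last:
            return genotype
    raise KeyError(f"SEQ ID NO:{seq_id} is not an E1 amino acid sequence")


if __name__ == "__main__":
    total = sum(last - first + 1 for first, last, _ in E1_GENOTYPE_RANGES)
    print(total, genotype_for_seq_id(92))
```

Summing the range sizes gives 51 sequences in twelve genotypes, of which seven contain more than one sequence, in agreement with the text.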


For those genotypes containing more than one E1
amino acid sequence, computer alignment of the constituent
sequences of each genotype was conducted using the
computer program GENALIGN in order to produce a consensus
sequence for each genotype. These alignments and their
resultant consensus sequences are shown in Figures 2A-1
thru 2G-2 for the seven genotypes (I/1a, II/1b, III/2a,
IV/2b, V/3a, 4c and 5a) which comprise more than one
sequence. Further alignment of the consensus sequences
shown in Figures 2A-1 thru 2G-2 with the amino acid
sequences of SEQ ID NO:85 (genotype 2c); SEQ ID NO:91
(genotype 4a); SEQ ID NO:92 (genotype 4b); SEQ ID NO:95
(genotype 4d) and SEQ ID NO:102 (genotype 6a) to produce a
consensus amino acid sequence for all twelve genotypes is
shown in Figures 2H-1 and 2H-2. The multiple alignment of
E1 amino acid sequences shown in Figures 2A-1 thru 2H-2
produces consensus sequences which serve to highlight
regions of homology and non-homology between E1 amino acid
sequences of the same genotype and of different genotypes
and hence, these alignments can readily be used by those
skilled in the art to design peptides useful in assays and
vaccines for the diagnosis and prevention of HCV
infection.
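
The idea of a consensus sequence that highlights conserved and variable positions can be sketched as follows. The rule used here (a residue is reported only where every aligned sequence agrees, and a placeholder is written elsewhere) is an assumption chosen for simplicity; the specification does not state the rule applied by GENALIGN. The function names and the toy alignment are hypothetical.

```python
def consensus(aligned, ambiguous="X"):
    """Strict consensus of equal-length aligned sequences.

    A position keeps its residue only if all sequences agree there;
    otherwise the `ambiguous` symbol is written.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    return "".join(
        column[0] if len(set(column)) == 1 else ambiguous
        for column in zip(*aligned)
    )


def variable_positions(aligned):
    """1-based positions at which the aligned sequences differ."""
    return [
        index
        for index, column in enumerate(zip(*aligned), start=1)
        if len(set(column)) > 1
    ]


if __name__ == "__main__":
    toy = ["ACDE", "ACDE", "ACGE"]  # toy alignment, not real HCV data
    print(consensus(toy), variable_positions(toy))
```

Runs of positions without the placeholder correspond to the conserved domains from which universal peptides might be chosen, while positions that are conserved within one genotype but differ between genotypes point to genotype-specific domains.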

In another embodiment, the computer analysis of
SEQ ID NOs:155-206 by the program GENALIGN results in
distribution of the 52 HCV core sequences into 14
genotypes based upon identification of genotype-specific
amino acid sequences.

The grouping of SEQ ID NOs:155-206 into 14 HCV
genotypes is shown below:

SEQ ID NOs      Genotypes

155-160         I/1a
161-176         II/1b
177-180         III/2a
181-185         IV/2b
186             2c
187-190         V/3a
191             4a
193             4b
195             4c
196             4c
197             4d
194             4e
192             4f
198-205         5a
206             6a



These fourteen genotypes can be further grouped
into six major genotypes designated genotypes 1-6 as
described earlier for the core nucleotide sequences of the
present application. Computer alignment of the amino acid
sequences disclosed in SEQ ID NOs:155-206 is shown in
Figures 7A-1 thru 7J-1. As with the multiple alignments
of the E1 amino acid sequences, the consensus sequences
shown in Figures 7A-1 thru 7J-1 serve to highlight regions
of homology and non-homology between core amino acid
sequences of the same genotype and of different genotypes
and hence, these alignments can readily be used by those
skilled in the art to design peptides useful in assays and
vaccines for the diagnosis and prevention of HCV
infection.

Examples of purified and isolated peptides
deduced from the alignments shown in Figures 2A-1 thru
2H-2 include, but are not limited to, SEQ ID NOs:240-263
wherein these peptides are derived from two regions of the
amino acid sequences shown in Figures 2A-1 thru 2H-2,
amino acids 48-80 and amino acids 138-160. The peptides
shown in SEQ ID NOs:240-263 are useful as
genotype-specific diagnostic reagents since they are capable of
detecting an immune response specific to HCV isolates
belonging to a single genotype. The genotype-specificity
of the peptides shown in SEQ ID NOs:240-263 is as
follows: SEQ ID NOs:240 and 252 are specific for genotype
IV/2b; SEQ ID NOs:241 and 253 are specific for genotype
2c; SEQ ID NOs:242 and 254 are specific for genotype
III/2a; SEQ ID NOs:243 and 255 are specific for genotype
V/3a; SEQ ID NOs:244 and 256 are specific for genotype
II/1b; SEQ ID NOs:245 and 257 are specific for genotype
I/1a; SEQ ID NOs:246 and 258 are specific for genotype 4a;
SEQ ID NOs:247 and 259 are specific for genotype 4c; SEQ
ID NOs:248 and 260 are specific for genotype 4d; SEQ ID
NOs:249 and 261 are specific for genotype 4b; SEQ ID
NOs:250 and 262 are specific for genotype 5a and SEQ ID
NOs:251 and 263 are specific for genotype 6a. In SEQ ID
NO:240, Xaa at position 22 is a residue of Ala or Thr, Xaa
at position 24 is a residue of Val or Ile, Xaa at position
26 is a residue of Val or Met; in SEQ ID NO:242, Xaa at
position 5 is a Ser or Thr residue, Xaa at position 11 is
an Arg or Gln residue, Xaa at position 12 is an Arg or Gln
residue; in SEQ ID NO:243, Xaa at position 3 is a Pro or
Ser residue, Xaa at position 33 is a Leu or Met residue;
in SEQ ID NO:244, Xaa at position 5 is a Thr or Ala
residue, Xaa at position 13 is a Gly, Ala, Ser, Val or Thr
residue, Xaa at position 14 is a Ser, Thr or Asn residue,
Xaa at position 15 is a Val or Ile residue, Xaa at
position 16 is a Pro or Ser residue, Xaa at position 18 is
a Thr or Lys residue, Xaa at position 19 is a Thr or Ala
residue, Xaa at position 22 is an Arg or His residue, Xaa
at position 32 is an Ala, Val or Thr residue; in SEQ ID
NO:245, Xaa at position 3 is an Ala or Pro residue, Xaa at
position 4 is a Val or Met residue, Xaa at position 5 is a
Thr or Ala residue, Xaa at position 17 is a Thr or Ala
residue, Xaa at position 18 is a Thr or Ala residue, Xaa
at position 23 is a His or Tyr residue; in SEQ ID NO:247,
Xaa at position 10 is a Val or Ala residue, Xaa at
position 11 is a Ser or Pro residue, Xaa at position 18 is
an Asp or Glu residue, Xaa at position 20 is a Leu or Ile
residue; in SEQ ID NO:250, Xaa at position 3 is a Gln or
His residue, Xaa at position 12 is an Asn, Ser or Thr
residue, Xaa at position 13 is a Leu or Phe residue, Xaa
at position 23 is an Ala or Val residue; in SEQ ID NO:252,
Xaa at position 16 is a Val or Ala residue, Xaa at
position 18 is a Glu or Gln residue; in SEQ ID NO:254, Xaa
at position 2 is an Ala or Thr residue, Xaa at position 4
is a Met or Leu residue, Xaa at position 9 is an Ala or
Val residue, Xaa at position 17 is an Ile or Leu residue,
Xaa at position 20 is an Ile or Val residue, Xaa at
position 21 is a Ser or Gly residue; in SEQ ID NO:151, Xaa
at position 9 is a Val or Ile residue, Xaa at position 16
is a Leu or Val residue, Xaa at position 20 is an Ile or
Leu residue; in SEQ ID NO:256, Xaa at position 2 is an Ala
or Thr residue, Xaa at position 6 is a Val or Leu residue,
Xaa at position 12 is an Ile or Leu residue, Xaa at
position 16 is a Val or Ile residue, Xaa at position 17 is
a Val, Leu or Met residue, Xaa at position 19 is a Met or
Val residue, Xaa at position 21 is an Ala or Thr residue;
in SEQ ID NO:257, Xaa at position 2 is a Thr or Ala
residue, Xaa at position 6 is a Val, Ile or Met residue,
Xaa at position 12 is an Ile or Val residue, Xaa at
position 16 is an Ile or Val residue; in SEQ ID NO:155, Xaa
at position 5 is a Leu or Val residue, Xaa at position 21
is a Thr or Ala residue; in SEQ ID NO:262, Xaa at position
1 is a Thr or Ala residue, Xaa at position 5 is a Val or
Leu residue, Xaa at position 9 is a Leu, Met or Val
residue, Xaa at position 23 is a Gly or Ala residue.
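
A peptide containing Xaa positions describes a family of concrete sequences, one for each combination of the permitted residues. The sketch below enumerates such a family. It is illustrative only: the function name `expand_degenerate` is hypothetical, Xaa is written as "X" in one-letter code, positions are 1-based, and the short peptide in the demonstration is invented and is not any SEQ ID NO of this specification.

```python
from itertools import product


def expand_degenerate(peptide, alternatives):
    """List every concrete peptide described by a degenerate one.

    peptide      -- one-letter sequence with 'X' at each Xaa position
    alternatives -- maps a 1-based position to a string of the
                    residues permitted there, e.g. {22: "AT"}
    """
    choices = []
    for position, residue in enumerate(peptide, start=1):
        if residue == "X":
            if position not in alternatives:
                raise ValueError(f"no alternatives for position {position}")
            choices.append(alternatives[position])
        else:
            choices.append(residue)
    # product() takes one residue from each position's choices in turn.
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Invented toy peptide: position 2 is Ala or Thr, position 4 Val or Ile.
    for variant in expand_degenerate("GXKXL", {2: "AT", 4: "VI"}):
        print(variant)
```

The number of variants is the product of the number of alternatives at each Xaa position, so a peptide with many multi-residue positions describes a large family.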

Examples of core amino acid domains from which
genotype-specific peptides may be deduced include, but are
not limited to, those shown below where the sequence in
which the indicated domains are found is given in
parentheses to the right of each genotype:

Genotype                                              Amino Acid Domains

1a (consensus sequence of Figures 7A-1 and 7A-2)      67-78
1b (consensus sequence of Figures 7B-1 and 7B-2)      67-78
2 (consensus sequence of Figures 7F-1 and 7F-2)       66-81
                                                      110-119
2a (consensus sequence of Figure 7D-1)                67-78
                                                      115-125
2b (consensus sequence of Figure 7E-1)                67-78
2c (SEQ ID NO:186)                                    12 67-78
                                                      75-81
                                                      184-191
3a (consensus sequence of Figure 7G-1)                8-22
                                                      32-46
                                                      67-78
                                                      158-170
                                                      180-191
4 (consensus sequence of Figures 7H-1 and 7H-2)       14-23
4a (SEQ ID NO:191)                                    67-78
4b (SEQ ID NO:193)                                    45-57
4c (SEQ ID NO:195)                                    67-78
4d (SEQ ID NO:197)                                    67-78
4e (SEQ ID NO:194)                                    67-78
4f (SEQ ID NO:192)                                    67-78
5a (consensus sequence of Figure 7J-1)                67-78
6a (SEQ ID NO:206)                                    67-78
                                                      101-108
                                                      144-155
                                                      157-163

Those skilled in the art would be aware that the
peptides of the present invention or analogs thereof can
be synthesized by automated instruments sold by a variety
of manufacturers or can be commercially custom-ordered and
prepared. The term analog has been described earlier in
the specification and for purposes of describing the
peptides of the present invention, analogs can further
include branched, cyclic or other non-linear arrangements
of the peptide sequences of the present invention.

Alternatively, peptides can be expressed from
nucleic acid sequences where such sequences can be DNA,
cDNA, RNA or any variant thereof which is capable of
directing protein synthesis. In one embodiment,
restriction digest fragments containing a coding sequence
for a peptide can be inserted into a suitable expression
vector that functions in prokaryotic or eukaryotic cells.
Such restriction digest fragments may be obtained from
clones isolated from prokaryotic or eukaryotic sources
which encode the peptide sequence.

Suitable expression vectors and methods of
isolating clones encoding the peptide sequences of the
present invention have previously been described. In yet
another embodiment, an oligonucleotide capable of
directing host organism synthesis of the given peptide may
be synthesized and inserted into the expression vector.

The preferred size of the peptides of the
present invention is from about 8 to about 100 amino acids
in length when the peptides are chemically synthesized,
with a more preferred size being about 8 to about 30 amino
acids and a most preferred size being about 10 to about 20
amino acids in length. For recombinantly expressed
peptides, the size may range from about 20 to about 190
amino acids in length with a more preferred size being
about 70 amino acids.

The present invention further relates to the use
of genotype-specific peptides in methods of detecting
antibodies against a specific genotype of HCV in
biological samples. In one embodiment, at least one
genotype-specific peptide deduced from a genotype-specific
core or E1 amino acid domain may be used in any of the
immunoassays described herein to detect antibodies
specific for a single genotype of HCV. In another
embodiment, at least one genotype-specific peptide deduced
from a genotype-specific core nucleotide domain and at
least one genotype-specific peptide deduced from an E1
amino acid domain may be used in an immunoassay to detect
antibodies against a single genotype of HCV. A preferred
immunoassay is ELISA.

It is understood by those skilled in the art
that the diagnostic assays described herein using
genotype-specific oligonucleotides or genotype-specific
peptides can be useful in assisting one skilled in the art
to choose a course of therapy for the HCV-infected
individual.

In an alternative embodiment, a mixture of
genotype-specific peptides can be used in an immunoassay
to detect antibodies against multiple genotypes of HCV
disclosed herein. For example, a mixture of
genotype-specific peptides deduced from E1 amino acid sequences may
comprise at least one peptide selected from SEQ ID
NOs:244-245 and 256-257; one peptide selected from SEQ ID
NOs:240, 242, 252 and 254; one peptide selected from SEQ
ID NOs:246-249 and 258-261; one peptide selected from SEQ
ID NOs:250 and 262; one peptide selected from SEQ ID
NOs:243 and 255; one peptide selected from SEQ ID NOs:242
and 254 and one peptide selected from SEQ ID NOs:244 and
263. In a preferred embodiment, the peptides of the
present invention can be used in an ELISA assay as
described previously for recombinant E1 and core proteins.

In an alternative embodiment, the peptide(s)
utilized in an immunoassay to detect all the genotypes of
HCV disclosed herein may be a universal peptide deduced
from universally conserved amino acid domains of the E1 or
core proteins disclosed herein.

Examples of universally conserved core amino
acid domains within the consensus sequence shown in Figure
7J-1 from which universal peptides may be deduced include,
but are not limited to, amino acid domains 23-35, 53-66,
93-108, 122-138, 150-156, and 165-181 of the consensus
sequence. Examples of universally conserved E1 amino acid
domains within the HCV E1 protein are located within the
consensus sequence for the 51 HCV E1 proteins shown in
Figures 2H-1 and 2H-2 of the present application.
Examples of universally conserved domains within the
consensus sequence shown in Figures 2H-1 and 2H-2 include,
but are not limited to, amino acid domains 10-20, 111-120,
and 124-137 of the consensus sequence. The universal
peptides of the present invention may be used in an
immunoassay to detect antibodies in patient sera specific
for any of the genotypes of HCV disclosed herein.

The peptides of the present invention or analogs
thereof may be prepared in the form of a kit, alone or in
combinations with other reagents such as secondary
antibodies, for use in immunoassay.

In another embodiment, the genotype-specific and
universal peptides of the present invention may be used to
produce antibodies that will react against HCV E1 or core
proteins in immunoassays. In one embodiment, a
genotype-specific E1 or core peptide can be used alone or in
combination with other E1 or core peptides specific to the
same genotype as immunogens to produce antibodies specific
to HCV proteins of a single genotype.

In another embodiment, a mixture of peptides
specific for different genotypes may be used to produce
antibodies that will react with HCV proteins of any
genotype disclosed herein. More preferably, antibodies
reactive with HCV proteins of any genotype may be produced
by immunizing an animal with universal peptide(s) of the
present invention. Examples of immunoassays in which such
antibodies could be utilized to detect HCV E1 and core
proteins in biological samples include, but are not
limited to, radioimmunoassays and ELISAs. Examples of
biological samples in which HCV E1 and core proteins could
be detected include, but are not limited to, serum,
saliva and liver.

Of course, those skilled in the art would
readily understand that the genotype-specific and
universal peptides of the present invention and expression
vectors containing nucleic acid sequence capable of
directing host organism synthesis of these peptides could
also be used as vaccines against hepatitis C.
Formulations suitable for administering the peptide(s) and
expression vectors of the present invention as immunogen,
routes of administration, pharmaceutical compositions
comprising the peptides, expression vectors and so forth
are the same as those previously described for recombinant
E1 and core proteins.

The genotype-specific and universal peptides of
the present invention and expression vectors containing
nucleic acid sequence capable of directing host organism
synthesis of these peptides may also be supplied in the
form of a kit, alone, or in the form of a pharmaceutical
composition as described above for recombinant E1 and core
proteins.

Any articles or patents referenced herein are
incorporated by reference. The following examples
illustrate various aspects of the invention but are in no
way intended to limit the scope thereof.

MATERIALS

Serum used in these examples was obtained from
84 anti-HCV positive individuals who were previously found
to be positive for HCV RNA in a cDNA PCR assay with primer
set a from the 5' NC region of the HCV genome (Bukh, J. et
al. (1992b) Proc. Natl. Acad. Sci. USA 89:4942-4946).
These samples were from 12 countries: Denmark (DK);
Dominican Republic (DR); Germany (D); Hong Kong (HK);
India (IND); Sardinia, Italy (S); Peru (P); South Africa
(SA); Sweden (SW); Taiwan (T); United States (US); and
Zaire (Z).

Example 1 

Identification of the cDNA Sequence
of the E1 Gene of 51 Isolates of HCV via
RT-PCR Analysis of Viral RNA Using Universal Primers

15 

Viral RNA was extracted from 100 µl of serum by
the guanidinium-phenol-chloroform method, and the final RNA
solution was divided into 10 equal aliquots and stored at
-80°C as described (Bukh et al. (1992a)). The
sequences of the synthetic oligonucleotides used in the
RT-PCR assay, deduced from the sequence of HCV strain H-77
(Ogata, N. et al. (1991) Proc. Natl. Acad. Sci. USA
88:3392-3396), are shown as SEQ ID NOs:207-212. One
aliquot of the final RNA solution, equivalent to 10 µl of
serum, was used for cDNA synthesis, which was performed in a
20 µl reaction mixture using avian myeloblastosis virus
reverse transcriptase (Promega, Madison, WI) and SEQ ID
NO:208 as a primer. The resulting cDNA was amplified in a
"nested" PCR assay by Taq DNA polymerase (Amplitaq,
Perkin-Elmer/Cetus) as described previously (Bukh et al.
(1992a)) with primer set e (SEQ ID NOs:207-210).
Precautions were taken to avoid contamination with
exogenous HCV nucleic acid (Bukh et al. (1992a)), and
negative controls (normal, uninfected serum) were
interspersed between every test sample in both the RNA
extraction and cDNA PCR procedures. No false positive
results were observed in the analysis. In most instances,
amplified DNA (first or second PCR products) was
reamplified with primers SEQ ID NO:211 and SEQ ID NO:212
prior to sequencing, since these two primers contained
EcoRI sites that would facilitate future cloning of the
E1 gene. Amplified DNA was purified by gel
electrophoresis followed by glass-milk extraction
(Geneclean, BIO 101, La Jolla, CA), and both strands were
sequenced directly by the dideoxynucleotide chain
termination method (Bachman, B. et al. (1990) Nucl. Acids
Res. 18:1309) with phage T7 DNA polymerase (Sequenase,
United States Biochemicals, Cleveland, OH), [alpha-35S]dATP
(Amersham, Arlington Heights, IL) or [alpha-33P]dATP
(Amersham or DuPont, Wilmington, DE) and sequencing
primers. RNA extracted from serum containing HCV strain
H-77, previously sequenced by Ogata, N. et al. (1991), was
amplified with primer set e (SEQ ID NOs:207-210) and
sequenced in parallel as a control. The nucleotide
sequences of the envelope 1 (E1) gene of all 51 HCV
isolates are shown as SEQ ID NOs:1-51. In all 51 HCV
isolates, the E1 gene was exactly 576 nucleotides in
length and did not have any in-frame stop codons.
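
The two properties reported above for each E1 sequence (a fixed length and the absence of in-frame stop codons) can be checked mechanically. The following is a minimal sketch in Python, not the procedure used in this study; the function name is hypothetical, and the reading frame is assumed to start at the first nucleotide. The test fragment is the first 39 nucleotides of SEQ ID NO:1.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def check_open_reading_frame(seq, expected_length=None):
    """Return (length_ok, stop_positions) for a coding sequence read in frame 1.

    stop_positions holds the 1-based codon indices of any in-frame stop codons.
    """
    seq = "".join(seq.split()).upper()      # drop spaces and line breaks
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    stops = [i + 1 for i, codon in enumerate(codons) if codon in STOP_CODONS]
    length_ok = expected_length is None or len(seq) == expected_length
    return length_ok, stops

# First 39 nucleotides of SEQ ID NO:1 (isolate DK7), as printed in the sequence listing.
fragment = "TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC"
print(check_open_reading_frame(fragment, expected_length=39))  # (True, [])
```

For a complete E1 sequence the same call would be made with expected_length=576.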

Example 2 

Computer Analysis of the Nucleotide
and Deduced Amino Acid Sequences
of the E1 Gene of 51 HCV Isolates

Multiple computer-generated alignments of the
nucleotide (SEQ ID NOs:1-51, Figures 1A-1 thru 1H-5) and
deduced amino acid sequences (SEQ ID NOs:52-102, Figures
2A-1 thru 2H-2) of the cDNAs of the 51 HCV isolates were
constructed using the computer program GENALIGN (Miller,
R.H. et al. (1990) Proc. Natl. Acad. Sci. USA 87:2057-
2061). On this basis the 51 HCV isolates were divided into
twelve genotypes according to the degree of variation of the
E1 gene sequence, as shown in Table 1.


The nucleotide and amino acid sequence identities
of HCV isolates of the same genotype were in the range of
88.0-99.1% and 89.1-98.4%, respectively, whereas those of
HCV isolates of different genotypes were in the range of
53.5-78.6% and 49.0-82.8%, respectively. The latter
differences are similar to those found when comparing the
envelope gene sequences of the various serotypes of the
related flaviviruses, as well as other RNA viruses. When
microheterogeneity in a sequence was observed, defined as
more than one prominent nucleotide at a specific position,
the nucleotide that was identical to that of the HCV
prototype (HCV1, Choo et al. (1989)) was reported if
possible. Alternatively, the nucleotide that was identical
to that of the most closely related isolate is shown.
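
As an illustration of how a pairwise identity figure of this kind is obtained, the sketch below computes percent identity between two pre-aligned sequences. It is a hedged example rather than the GENALIGN calculation: the function name is hypothetical, and the convention of skipping any column that contains a gap is an assumption. The two fragments are the first 39 nucleotides of SEQ ID NO:1 (DK7) and SEQ ID NO:2 (DK9).

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, ungapped positions at which two sequences agree."""
    a = "".join(seq_a.split()).upper()
    b = "".join(seq_b.split()).upper()
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap ('-') in either sequence are not counted.
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable positions")
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)

dk7 = "TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC"
dk9 = "TAC CAA GTA CGC AAC TCC TCG GGC CTC TAC CAT GTC ACC"
# 35 of the 39 positions match in this fragment.
print(round(percent_identity(dk7, dk9), 1))  # 89.7
```

The value for this short fragment is not one of the whole-gene identities reported above; it only shows the arithmetic.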

Analysis of the consensus sequence of the E1
protein of the 51 HCV isolates from this study demonstrated
that a total of 60 (30.3%) of the 192 amino acids of the E1
protein were invariant among these isolates (Figures 3A and
3B). Most strikingly, all 8 cysteine residues as well as 6
of 8 proline residues were invariant. The most abundant
amino acids (e.g. alanine, valine and leucine) showed a
very low degree of conservation. The consensus sequence of
the E1 protein contained 5 potential N-linked glycosylation
sites. Three sites at positions 209, 305 and 325 were
maintained in all 51 HCV isolates. A site at position 196
was maintained in all isolates except the sole isolate of
genotype 2c. Also, a site at position 234 was maintained
in all isolates except one isolate of genotype I/1a, all
four isolates of genotype IV/2b and the sole isolate of
genotype 6a. Conversely, only genotype IV/2b isolates had
a potential glycosylation site at position 233. Further
analysis revealed a highly conserved amino acid domain (aa
302-328) in the E1 protein with 20 (74.1%) of 27 amino
acids invariant among all 51 HCV isolates. It is possible
that the 5' and 3' ends of this domain are conserved due to
important cysteine residues and N-linked glycosylation
sites. The central sequence, 5'-GHRMAWDMM-3' (aa 315-323),
may be conserved due to additional functional constraints
on the protein structure. Finally, although the amino acid
sequence surrounding the putative E1 protein cleavage site
was variable, an amino acid doublet (GV) at position 380
was invariant among all HCV isolates.
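
Potential N-linked glycosylation sites such as those at positions 196 and 209 can be located by scanning a protein sequence for the N-X-S/T motif. The sketch below is illustrative only; the function name is hypothetical, and excluding proline at the X position is a common convention that is assumed here rather than stated in this specification. The peptide is the standard-code translation of the first 78 nucleotides of SEQ ID NO:1, numbered from amino acid 192, the first residue of E1.

```python
import re

def find_sequons(protein, first_residue_number=1, exclude_proline=True):
    """Return residue numbers of asparagines that begin an N-X-S/T motif."""
    middle = "[^P]" if exclude_proline else "."
    # A lookahead is used so that overlapping motifs are all reported.
    pattern = re.compile("(?=N" + middle + "[ST])")
    return [m.start() + first_residue_number
            for m in pattern.finditer(protein.upper())]

# Translation of nucleotides 1-78 of SEQ ID NO:1 (isolate DK7).
peptide = "YQVRNSTGLYHVTNDCPNSSIVYEAA"
print(find_sequons(peptide, first_residue_number=192))  # [196, 209]
```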

A dendrogram of the genetic relatedness of the E1
protein of selected HCV isolates representing the 12
genotypes is shown in Fig. 4. This dendrogram was
constructed using the program CLUSTAL (Higgins, D.G. et al.
(1988) Gene 73:237-244), which had a limit of 25 sequences.
Accordingly, from the 51 HCV isolates for which the
complete sequence of the E1 gene region was obtained, 25
isolates representing the twelve genotypes were selected
for analysis. The scale showing percent identity was added
based upon manual calculation. This dendrogram, in
combination with the analysis of the E1 gene sequence of
51 HCV isolates in Table 1, demonstrates extensive
heterogeneity of this important gene.

The worldwide distribution of the 12 genotypes
among 74 HCV isolates is depicted in Fig. 5. The complete
E1 gene sequence was determined in 51 of these HCV isolates
(SEQ ID NOs:1-51), including 8 isolates of genotype I/1a,
17 isolates of genotype II/1b and 26 isolates comprising
genotypes III/2a, IV/2b, 2c, 3a, 4a-4d, 5a and 6a. In the
remaining 23 isolates, all of genotypes I/1a and II/1b, the
genotype assignment was based on a partial E1 gene sequence
since they did not represent additional genotypes in any of
the 12 countries. The number of isolates of a particular
genotype is given for each of the 12 countries studied. Of
the twelve genotypes, genotypes I/1a and II/1b were the
most common, accounting for 48 (65%) of the 74 isolates.
Analysis of the E1 gene sequences available in the GenBank
database at the time of this study revealed that all 44
such sequences were of genotypes I/1a, II/1b, III/2a and
IV/2b. Thus, based upon E1 gene analysis, 8 new genotypes
of HCV have been identified.

Also of interest, different HCV genotypes were
frequently found in the same country, with the highest
number of genotypes (five) being detected in Denmark. Of
the twelve genotypes, genotypes I/1a, II/1b, III/2a, IV/2b
and V/3a were widely distributed, with genotype II/1b being
identified in 11 of the 12 countries studied (Zaire was the
only exception). In addition, while genotypes I/1a and
II/1b were predominant in the Americas, Europe and Asia,
several new genotypes were predominant in Africa.

It was also found that genotypes I/1a, II/1b,
III/2a, IV/2b and V/3a of HCV were widely distributed
around the world, whereas genotypes 2c, 4a, 4b, 4d, 5a and
6a were identified only in discrete geographical regions.
For example, the majority of isolates in South Africa
comprised a new genotype (5a) and all isolates in Zaire
comprised 3 new closely related genotypes (4a, 4b, 4c).
These genotypes were not identified outside Africa.

Example 3 

Identification of the cDNA Sequence
of the Core Gene of 52 Isolates of HCV

Viral RNA extraction, cDNA synthesis and "nested"
PCR were carried out as in Example 1. For the cDNA PCR
assay, HCV-specific synthetic oligonucleotides deduced from
previously determined sequences that flank the C gene were
used. Amplified DNA was purified by gel electrophoresis
followed by glass-milk extraction as described in Example 1
or by electroelution, and both strands were sequenced
directly. In 44 of the 52 HCV isolates studied, the
procedures for direct sequencing described in Example 1
were utilized. For a number of the HCV isolates,
confirmatory sequencing was performed with the Applied
Biosystems 373A automated DNA sequencer, and 8 HCV isolates
of genotype I/1a or II/1b were sequenced exclusively by
this method. All 73 negative control samples interspersed
among the test samples were negative for HCV RNA.

The amplified DNA fragment obtained in 50 of the
52 HCV isolates was specifically designed to overlap with
previously obtained 5' NC sequences (Bukh et al. (1992b)
Proc. Natl. Acad. Sci. U.S.A. 89:4942-4946) and with the E1
sequences disclosed herein at approximately 80 nucleotide
positions each. A complete match was observed in 6033 of
6035 overlapping nucleotides. Two discrepancies were
observed in isolate US6, at nt 552 (C and T) and nt 561 (C
and T), respectively. This may have been due to
microheterogeneity at these nucleotide positions, since the
remaining overlapping sequence was unique for isolate US6.
In addition, there were 3 confirmed instances of
microheterogeneity: nt 33 in isolate SA11 (C,T and T), nt
36 in isolate S45 (A,C and A), and nt 552 in isolate P10
(C,T and T). Overall, the excellent agreement of the
overlapping sequences in this study with the NC sequences
disclosed in Bukh et al. and with the E1 sequences
disclosed herein definitively ruled out contamination as a
source of non-authentic HCV sequences. Furthermore, this
analysis proved that the sequences obtained were from a
single population, and not from different populations as
could happen in mixed infections.

The core (C) gene was exactly 573 nucleotides in
length in all 52 HCV isolates, with an amino terminal start
codon and no in-frame stop codons. Microheterogeneity was
observed in 26 of the 52 HCV isolates at 0.2-1.4% of the
573 nucleotide positions of the C gene, and resulted in
changes in 0.5-1.0% of the 191 predicted amino acids in 12
of these isolates. A multiple sequence alignment was
performed, and it showed that the nucleotide identities of
the C gene among these HCV isolates were in the range of
79.4-99.0%. In order to compare the genetic relatedness of
HCV isolates in different gene regions, phylogenetic trees
of the C gene of all 52 HCV isolates and the E1 gene of 51
HCV isolates were constructed using the unweighted pair-
group method with arithmetic mean (Nei, M. (1987) Molecular
Evolutionary Genetics (Columbia University Press, New York,
N.Y.), pp. 287-326) (Figures 8A and 8B). In both
dendrograms, a division of the 45 HCV isolates from which C
and E1 genes had been cloned into at least six major
genetic groups (genotypes 1-6) and 12 minor genetic groups
(genotypes I/1a, II/1b, III/2a, IV/2b, 2c, V/3a, 4a-4d, 5a,
and 6a) was observed. It is noteworthy that a major
division in genetic distance between HCV isolates of
genotype 2 and those of the other genotypes was observed in
the phylogenetic analyses of both gene sequences.
Furthermore, the divergence of the minor genotypes within
genotype 2 exhibited a degree of heterogeneity that is
equivalent to that observed among the major genotypes.
Analysis of the C gene from isolates Z5 and Z8, which had a
unique 5' NC sequence (Bukh et al. (1992)) but from which
the E1 gene could not be amplified, revealed that these
isolates represented two additional genotypes. The
designations 4e and 4f are assigned to these genotypes, which
have not been described previously. Overall, the present
specification demonstrates that the genetic relatedness of
HCV isolates is equivalent when analyzing the most
conserved gene (C) and one of the most variable genes (E1)
of the HCV genome, thereby providing strong evidence for
the suggested division into major and minor genotypes.
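
The unweighted pair-group method with arithmetic mean (UPGMA) named above can be sketched in a few lines. The following Python example is a generic textbook version under stated assumptions, not the software used to produce Figures 8A and 8B: the function name, the isolate labels and the distances (percent nucleotide difference) are all hypothetical.

```python
def upgma(names, distance):
    """Cluster `names` by UPGMA.

    `distance` maps frozenset({a, b}) to the distance between a and b for
    every pair of names. Returns the merges as (cluster_a, cluster_b, height)
    in the order they are joined; merged clusters are nested tuples.
    """
    sizes = {name: 1 for name in names}   # cluster label -> number of leaves
    dist = dict(distance)
    merges = []
    while len(sizes) > 1:
        pair = min(dist, key=dist.get)    # closest pair of current clusters
        height = dist[pair] / 2.0
        a, b = sorted(pair, key=str)
        new = (a, b)
        for other in sizes:
            if other == a or other == b:
                continue
            # Size-weighted mean equals the arithmetic mean over all leaf pairs.
            dist[frozenset((new, other))] = (
                sizes[a] * dist[frozenset((a, other))]
                + sizes[b] * dist[frozenset((b, other))]
            ) / (sizes[a] + sizes[b])
        dist = {p: d for p, d in dist.items() if a not in p and b not in p}
        sizes[new] = sizes.pop(a) + sizes.pop(b)
        merges.append((a, b, height))
    return merges

# Hypothetical distances between three made-up isolates.
toy = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 8.0,
}
print(upgma(["A", "B", "C"], toy))
# [('A', 'B', 1.0), (('A', 'B'), 'C', 3.5)]
```

A and B are joined first at half their distance; the distance from the merged cluster to C is the mean of 6.0 and 8.0.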



Example 4

Computer Analysis of the Nucleotide and Deduced
Amino Acid Sequences of the Core Gene of 52 HCV Isolates

In order to study further the heterogeneity of
the C gene, a consensus sequence of the core gene from the
52 HCV isolates (Figures 6J-1 thru 6J-4) was obtained. A
total of 335 (58.5%) of the 573 nucleotides of the C gene
were invariant among these HCV isolates. Nucleotides at
the 1st and 2nd codon positions were invariant at 70.7% and
81.7% of these positions, respectively, while nucleotides
at the 3rd position were invariant at only 23.0% of such
positions. Stretches of 6 or more invariant nucleotides
were observed at nucleotides 1-8, 22-27, 85-92, 110-125,
131-141, 334-340, 364-371, 397-404, and 511-516 and may be
suitable for anchoring primers for amplification of HCV RNA
in cDNA PCR assays.
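
Stretches of invariant nucleotides of this kind can be located by scanning the columns of a multiple alignment. The sketch below is an illustration under stated assumptions, not the analysis performed in this study: the function name is hypothetical, the three 20-nucleotide sequences are made-up toy data, and a column containing a gap character is treated as variable.

```python
def invariant_runs(aligned_seqs, min_length=6):
    """Return 1-based (start, end) ranges of consecutive invariant columns."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    runs, start = [], None
    for i in range(length + 1):           # one extra step closes a final run
        column = {s[i] for s in aligned_seqs} if i < length else set()
        invariant = len(column) == 1 and "-" not in column
        if invariant and start is None:
            start = i
        elif not invariant and start is not None:
            if i - start >= min_length:
                runs.append((start + 1, i))
            start = None
    return runs

# Toy alignment: the sequences differ only at columns 9 and 15.
toy_alignment = [
    "ATGAGCACGAATCCTAAACC",
    "ATGAGCACAAATCCTAAACC",
    "ATGAGCACGAATCCCAAACC",
]
print(invariant_runs(toy_alignment))  # [(1, 8)]
```

The two other invariant stretches in the toy data are only 5 columns long and are therefore not reported at the default threshold of 6.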

Genotype-specific nucleotide positions of the
core gene of hepatitis C virus were also noted for each of
the genotypes. These genotype-specific nucleotides are
shown below, where each genotype-specific nucleotide is
given in parentheses next to the nucleotide position at
which it is found.

Genotype 1: 460 (C), 466 (C), 483 (C), 486 (G).

Genotype I/1a: 180 (T).

Genotype II/1b: 106 (C), 273 (G).

Genotype 2: 192 (C), 201 (A), 203 (A), 207 (G), 210 (C),
221 (A), 231 (A), 232 (A), 341 (A).

Genotype III/2a: 315 (C), 355 (G).

Genotype IV/2b: 45 (A), 174 (G), 216 (C), 348 (A), 376 (A),
414 (T).

Genotype 2c: 233 (G), 312 (C), 318 (A), 456 (C), 462 (G),
543 (C), 556 (T).

Genotype V/3a: 47 (T), 84 (A), 106 (G), 126 (A), 150 (T),
212 (G), 216 (A), 300 (A), 491 (T), 559 (C), 560 (A), 568
(G), 571 (A), 572 (G).

Genotype 4: 59 (T).

Genotype 4a: 213 (A), 231 (G), 415 (A).

Genotype 4b: 66 (G), 145 (G), 310 (A).

Genotype 4c: 213 (T), 219 (A), 270 (T).

Genotype 4d: 212 (T), 327 (G), 469 (C).

Genotype 4e: 199 (C), 306 (A), 326 (A).

Genotype 4f: 57 (T), 75 (A), 267 (A).

Genotype 5a: 291 (G), 294 (C).

Genotype 6a: 59 (C), 175 (A), 195 (A), 198 (A), 214 (C),
224 (A), 316 (C), 351 (G), 387 (G), 444-447 (GGCT), 450
(G), 471-472 (AA), 474 (C).

These genotype-specific nucleotides are of
utility in designing genotype-specific PCR primers and
hybridization probes.
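
By way of illustration, the genotype-specific positions listed above can be used as a simple lookup against a core gene sequence. The following Python sketch is hypothetical and is not an assay of the present application: it encodes only a subset of the listed markers, the function name is invented, and the test sequence is synthetic (a 573-nucleotide placeholder carrying the two II/1b markers).

```python
# Subset of the genotype-specific core-gene positions listed above (1-based).
GENOTYPE_MARKERS = {
    "I/1a": {180: "T"},
    "II/1b": {106: "C", 273: "G"},
    "III/2a": {315: "C", 355: "G"},
    "5a": {291: "G", 294: "C"},
}

def matching_genotypes(core_seq, markers=GENOTYPE_MARKERS):
    """Return the genotypes whose marker nucleotides are all present."""
    seq = "".join(core_seq.split()).upper()
    hits = []
    for genotype, positions in markers.items():
        if all(pos <= len(seq) and seq[pos - 1] == base
               for pos, base in positions.items()):
            hits.append(genotype)
    return hits

# Synthetic placeholder: 573 undetermined bases with the II/1b markers set.
bases = ["N"] * 573
bases[106 - 1] = "C"
bases[273 - 1] = "G"
print(matching_genotypes("".join(bases)))  # ['II/1b']
```

A real isolate would be checked against the complete marker lists, and markers of the corresponding major genotype (for example genotype 1 for II/1b) would be expected to match as well.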

Finally, although the full-length nucleic acid
sequences of the C gene of isolates representing genotypes
I/1a, II/1b, III/2a, IV/2b and V/3a have been reported by
others, those of 9 of the 14 genotypes (i.e., 2c, 4a-4f, 5a
and 6a) have not been reported previously. In sum, by
aligning the consensus sequences of the major genotypes,
the present application enables those skilled in the art to
map universally conserved sequences as well as genotype-
specific sequences of the C gene among 14 genotypes of HCV.

In order to study the heterogeneity of the
deduced C protein, a multiple sequence alignment of the
predicted amino acids for all 52 HCV isolates was
performed, and a consensus sequence was obtained (Figure
7J-1). The identities of the predicted 191 amino acids of
the C protein among these HCV isolates were in the range of
85.3-100.0%. A total of 132 (69.1%) of the 191 amino acids
of the C protein were invariant. The most prevalent amino
acids in the consensus sequence were glycine (13.6%),
arginine (12.6%), proline (11.0%), and leucine (9.9%). The
most conserved amino acids were tryptophan (5 of 5 amino
acids invariant), aspartic acid (5 of 5 amino acids
invariant), proline (19 of 21 amino acids invariant) and
glycine (23 of 26 amino acids invariant). Previous
analyses indicated that HCV is evolutionarily related to
pestiviruses (Miller et al. (1990) Proc. Natl. Acad. Sci.
U.S.A. 87:2057-2061). In this regard, it is of interest to
note that the C proteins of both viruses have a high
content of proline residues (Collette M.S. et al. (1988)
Virology 165:200-208), which are likely to be important in
maintaining the structure of this protein. As is
characteristic for a protein that binds to nucleic acid,
the C protein has conserved amino acids that are basic and
positively charged, and these are capable of neutralizing
the negative charge of the HCV RNA encapsidated by this
protein (Rice, C.M. et al. (1986) in Togaviridae and
Flaviviridae, eds. Schlesinger, S. & Schlesinger, M.J.
(Plenum Press, New York, N.Y.) pp. 279-326). Specifically,
over 16% of the amino acids in the consensus sequence of
the C protein of HCV are arginine and lysine, which are
located primarily in three clusters (i.e., amino
acids 6-23, 39-74 and 101-121) (Shih, C.M. et al. (1993) J.
Gen. Virol. 67:5823-5832) (Figure 7J-1). The 10 arginine
and lysine residues within amino acids 39-62 are invariant
among all 52 HCV isolates, suggesting that this domain may
represent an important RNA-binding site. The capsid
proteins of the related flavi- and pestiviruses (Miller et
al. (1990)) also have a high content of arginine and lysine
(Rice et al. (1986); Collette et al. (1988)). Although
there are three major hydrophilic regions (i.e., amino
acids 2-23, 39-74 and 101-121) that are conserved in all 52
HCV isolates, the remainder of the C protein is
hydrophobic. Interestingly, one such highly conserved
hydrophobic domain from aa 24-39 is flanked by proline
residues. The hydrophobic domains are likely to be
involved in protein-protein and/or protein-RNA interactions
during assembly of the nucleocapsid, as well as in
interaction with the lipoprotein envelope, as has been
suggested for flaviviruses (Rice et al. (1986)). Other
significant observations are: (i) a cluster of 5 invariant
tryptophan residues from aa 76-107; (ii) the lack of an N-
linked glycosylation site (N-X-T/S); (iii) two potential
nuclear localization signals (i.e., PRRGPR at amino acids
38-43 and PRGRRQP at amino acids 58-64) that are present in
all 52 HCV isolates (Shih et al. (1993)); and (iv) a
putative DNA-binding motif SPRG at amino acids 99-102,
found in 51 of the 52 HCV isolates, with SP present in all
52 isolates. This study demonstrates that the C protein
has features that are highly conserved among the various
genotypes of HCV, and that are known to be characteristic
of capsid proteins of other related viruses.

It should also be noted that the phylogenetic
analysis of the amino acid sequence of the C proteins was
not capable of resolving the minor groups within genotypes
1 and 4 because of the conservation of this protein (data
not shown). Indeed, only a few type-specific amino acids
were identified. One striking example was that isolates of
genotype 4 have an additional methionine at position 20
that is specific for this major genetic group. Finally,
the conservation of the sequences surrounding the cleavage
site between the C and the E1 proteins of the different
genotypes was analyzed. In HCV isolates of genotype 1 this
site has been determined to lie between amino acid 191
(alanine) and aa 192 (tyrosine) (Hijikata, M., et al.
(1991) Proc. Natl. Acad. Sci. USA 88:5547-5551). The
C-terminal sequence of C is serine-alanine in all but one
of the 48 HCV isolates comprising genotypes 1, 2, 4, 5 and 6.
However, all 4 HCV isolates of genotype 3 in this study, as
well as isolates of genotype 3 published previously
(Okamoto, H., et al. (1993) J. Gen. Virol. 74:2385-2390;
Stuyver, L., et al. (1993) Biochem. Biophys. Res. Comm.
192:635-641), contain alanine-serine at this position.
Thus, studies will be needed to determine the C/E1 cleavage
site in genotype 3 isolates. Overall, the present
application discloses the mapping of universally
conserved sequences, as well as genotype-specific
sequences, of the C protein among 14 genotypes of HCV.

Implications of the mapping of universally
conserved and genotype-specific core nucleotide
and amino acid sequences for diagnosis of
HCV infection and for determination of HCV
genotypes

Detection of antibodies directed against the HCV
core protein is important in the diagnosis of HCV
infection. The recombinant C22-3 protein, spanning amino
acids 2-120 of the C protein, is a major component of the
commercially available second-generation anti-HCV tests.
Several studies have indicated that the three major
hydrophilic regions of the C protein contain linear
immunogenic epitopes (summarized in Sallberg, M. et al.
(1992) J. Clin. Microbiol. 30:1989-1994). For example,
antibodies against synthetic peptides from amino acids 1-
18, 51-68 and 101-118 were detected in infected patients
(Sallberg, M. et al. (1992)). The present application
demonstrates that, while these immunogenic regions are
highly conserved, genotype-specific differences are
observed at several amino acid positions that may influence
the specificity and sensitivity of the serological tests.
One such example is that a single amino acid substitution
at amino acid 110 has been demonstrated to affect sero-
reactivity (Sallberg et al. (1992)). Despite the high
degree of conservation in the immunodominant regions of the
C protein among the different genotypes, it is possible
that genetic heterogeneity of the C protein could lead to
false negative results in current serological tests.

With respect to genotype analysis, several
methods have been used to determine the genotype of HCV
isolates without resorting to sequence analysis. These
include PCR followed by: (i) amplification with type-
specific primers (Okamoto, H. et al. (1992) J. Gen. Virol.
73:673-679); (ii) determination of restriction fragment length
polymorphism (Simmons, P. et al. (1993) J. Gen. Virol.
74:661-668); and (iii) specific hybridization (Stuyver, L.
(1993) J. Gen. Virol. 74:1093-1102). The proposed methods
have primarily been based on 5' NC and C sequences.
Previous studies suggested that 5' NC-based genotyping
systems would only be predictive of the major genetic
groups of HCV (Bukh, J., et al. (1992) Proc. Natl. Acad.
Sci. USA 89:4942-4946; Bukh, J., et al. (1993) Proc.
Natl. Acad. Sci. USA 90:8234-8238). The most widely used C-
based genotyping system has been the PCR assay with type-
specific primers that was designed for distinguishing HCV
isolates of genotypes I/1a, II/1b, III/2a, IV/2b and V/3a
(Okamoto, H., et al. (1993) J. Gen. Virol. 74:2385-2390;
Okamoto, H. et al. (1992) J. Gen. Virol. 73:673-679).
Since this system was developed prior to the identification
of genotypes 2c, 4a-4f, 5a and 6a, there are significant
limitations to this typing system. For example, the
primers specific for genotype IV/2b (nt 270-251) are as
highly conserved within isolates of genotypes 4c and 6a as
within the isolates of genotype IV/2b. Thus, this assay
probably cannot distinguish among these genotypes. Another
C-based approach involves distinguishing between genotypes
1 and 2 by type-specific antibody responses (Machida et al.
(1992) Hepatology 16:886-891). Synthetic peptides
composed of amino acids 65-81 were found to be genotype-
specific for genotypes 1 and 2 in ELISA assays. The
present analysis of amino acid sequences demonstrated
significant variation within isolates of genotypes 1 and 2.
Thus it is likely that these peptides will not identify all
isolates of genotypes 1 and 2. Furthermore, the peptide
for genotype 1 was highly conserved within isolates of
genotypes 3 and 4 and might detect antibodies against these
genotypes as well. Finally, it should be pointed out that
most isolates of genotypes 3 and 4 had an identical amino
acid sequence at positions 65-81.



Example 5 

Detection by ELISA Based on Antigen from
Insect Cells Expressing Complete E1 or Core Protein

Expression of E1 or Core protein in SF9 cells. A
cDNA (e.g., SEQ ID NO:1) encoding a complete E1 protein (e.g.,
SEQ ID NO:52) or a cDNA (e.g., SEQ ID NO:103) encoding a
complete core protein (e.g., SEQ ID NO:155) is subcloned
into the pBlueBac transfer vector (Invitrogen) using standard
subcloning procedures. The resultant recombinant
expression vector is cotransfected into SF9 insect cells
(Invitrogen) by the Ca precipitation method according to
the Invitrogen protocol.

ELISA Based on Infected SF9 cells. 5 x 10^6 SF9
cells infected with the above-described recombinant
expression vector are resuspended in 1 ml of 10 mM Tris-
HCl, pH 7.5, 0.15 M NaCl and are then frozen and thawed 3
times. 10 µl of this suspension is dissolved in 10 ml of
carbonate buffer (pH 9.6) and used to cover one flexible
microtiter assay plate (Falcon). Serum samples are diluted
1:20, 1:400 and 1:8000, or 1:100, 1:1000 and 1:10000.
Blocking and washing solutions for use in the ELISA assay
are PBS containing 10% fetal calf serum and 0.5% gelatin
(blocking solution) and PBS with 0.05% Tween-20 (Sigma,
St. Louis, MO) (washing solution). As a secondary antibody,
peroxidase-conjugated goat IgG fraction to human IgG or
horseradish peroxidase-labelled goat anti-Old or anti-New
World monkey immunoglobulin is used. The results are
determined by measuring the optical density (O.D.) at 405
nm.

To determine if insect cell-derived E1 or core
protein representing genotype I/1a of HCV could detect anti-
HCV antibody in chimpanzees infected with genotype I/1a of
HCV, three infected chimpanzees are examined. The sera of
all 3 chimpanzees are found to seroconvert to anti-HCV.

Example 6 

Use of the Complete
E1 Protein as a Vaccine

Mammals are immunized with purified or partially
purified E1 protein in an amount sufficient to stimulate
the production of protective antibodies. The immunized
mammals challenged with various genotypes of HCV are
protected.

It is understood by one skilled in the art that
the recombinant E1 protein used in the above vaccine can
also be used in combination with other recombinant E1
proteins having an amino acid sequence shown in SEQ ID
NOs:52-102. In addition, recombinant core proteins having
an amino acid sequence shown in SEQ ID NOs:155-206 could
also be used in the above vaccine, either alone, in
combination with other recombinant core proteins of the
present invention, or in combination with recombinant E1
proteins having an amino acid sequence shown in SEQ ID
NOs:52-102.






Example 7 

Determination of the Genotype of an HCV
Isolate Via Hybridization of Genotype-Specific
Oligonucleotides to RT-PCR Amplification Products

Viral RNA is isolated from serum obtained from a
mammal and is subjected to RT-PCR as in Example 1 or
Example 3. Following amplification, the amplified DNA is
purified as described in Example 1 or Example 3 and
aliquots of 100 µl of amplification product are applied as
dots to a nitrocellulose filter set in a dot blot
apparatus. The filter is then cut into separate dots and
each dot is hybridized to a 32P-labelled oligonucleotide
specific for a single genotype of HCV. The
oligonucleotides to be used as hybridization probes are
deduced from the consensus sequences shown in Figures 1A-1
thru 1H-5 or 6A-1 thru 6J-4 or from the SEQ ID NOs
representing E1 or core sequences comprising genotypes 4a-
4f, 2c and 6a.



Example 8 

ELISA Based on Synthetic
Peptides Derived From E1 cDNA Sequences

E1 peptide(s) specific for genotype I/1a is
placed in 0.1% PBS buffer and 50 µl of a 1 mg/ml solution of
peptide is used to cover each well of the microtiter assay
plate. Serum samples from two mammals infected with
genotype I/1a HCV and from one mammal infected with
genotype 5a HCV are diluted as in Example 5 and the ELISA
is carried out as in Example 5. Both mammals infected with
genotype I/1a HCV react positively with the peptides, while the
mammal infected with genotype 5a HCV exhibits no
reactivity. One skilled in the art would readily
understand that in the above experiment, core peptides
specific for genotype I/1a could be used in place of, or in
combination with, the E1 genotype-specific peptide(s).

Example 9

Use of E1 Peptides as a Vaccine

Since the E1 genotype-specific peptides of the
present invention are derived from two variable regions in
the complete E1 protein, there exists support for the use
of these peptides as a vaccine to protect against a variety
of HCV genotypes. Mammals are immunized with peptide(s)
selected from SEQ ID NOs:136-159 in an amount sufficient
to stimulate production of protective antibodies. The
immunized mammals challenged with various genotypes of HCV
are protected. One skilled in the art would readily
understand that genotype-specific core peptides of the
present invention could also be used either alone, in
combination with each other, or in combination with the
genotype-specific E1 peptides, as a vaccine to protect
against a variety of HCV genotypes. In addition, the above
vaccines may also be formulated using the universal core
and/or E1 peptides of the present invention.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANTS: BUKH, J., MILLER, R.H. AND PURCELL, R.H.

(ii) TITLE OF INVENTION: NUCLEOTIDE AND DEDUCED
AMINO ACID SEQUENCES OF THE ENVELOPE 1 AND
CORE GENES OF ISOLATES OF HEPATITIS C VIRUS
AND THE USE OF REAGENTS DERIVED FROM THESE
SEQUENCES IN DIAGNOSTIC METHODS AND VACCINES

(iii) NUMBER OF SEQUENCES: 263

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: MORGAN & FINNEGAN, L.L.P.
(B) STREET: 345 PARK AVENUE
(C) CITY: NEW YORK
(D) STATE: NEW YORK
(E) COUNTRY: USA
(F) ZIP: 10154

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: FLOPPY DISK
(B) COMPUTER: IBM PC COMPATIBLE
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: WORDPERFECT 5.1

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: TO BE ASSIGNED
(B) FILING DATE: 15-AUG-1995
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/086,428
(B) FILING DATE: 29 JUNE 1993

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: 08/290,665
(B) FILING DATE: 15 AUGUST 1994

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: RICHARD W. BORK
(B) REGISTRATION NUMBER: 36,459
(C) REFERENCE/DOCKET NUMBER: 2026-4116

(viii) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (212) 758-4800
(B) TELEFAX: (212) 751-6849
(C) TELEX: 421792
(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homo sapiens
(C) INDIVIDUAL ISOLATE: DK7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC 39
AAT GAT TGC CCT AAC TCG AGT ATC GTG TAC GAG GCG GCC 78
GAT GCC ATC CTG CAC ACT CCG GGG TGT GTC CCT TGC GTT 117
CGC GAG GGT AAC GTC TCG AGG TGT TGG GTG GCG ATG ACC 156
CCC ACG GTG GCC ACC AGG GAT GGC AAA CTC CCC ACA GCG 195
CAG CTT CGA CGT CAC ATC GAT CTG CTC GTC GGG AGT GCC 234
ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273
TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC 312
AGG CGC CAC TGG ACG ACG CAA GGC TGC AAT TGT TCT ATC 351
TAT CCT GGC CAT ATA ACG GGT CAC CGC ATG GCG TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACC ACG GCG TTG GTA GTA 429
GCT CAG CTG CTC CGG ATC CCG CAA GCC ATC TTG GAC ATG 468
ATC GCT GGT GCT CAC TGG GGA GTC CTG GCG GGC ATA GCG 507
TAT TTT TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546
GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG 576



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TAC CAA GTA CGC AAC TCC TCG GGC CTC TAC CAT GTC ACC 39
AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG GCG GCC 78
GAT GCC ATC CTG CAT TCT CCA GGG TGT GTC CCT TGC GTT 117
CGC GAG GGT AAC GCC TCG AAA TGT TGG GTG GCG GTG GCC 156
CCC ACG GTG GCC ACC AGG GAC GGC AAG CTC CCC GCA ACG 195
CAG CTT CGA CGT CAC ATC GAT CTG CTT GTC GGG AGC GCC 234
ACC CTC TGC TCG GCC CTC TAT GTG GGG GAC TTG TGC GGG 273
TCT GTC TTC CTT GTC GGC CAA CTG TTC ACC TTC TCC CCC 312
AGA CGC CAC TGG ACA ACG CAA GAC TGC AAC TGT TCT ATC 351
TAC CCC GGC CAT ATT ACG GGT CAT CGC ATG GCG TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACA GCA GCG CTG GTA ATG 429
GCG CAG CTG CTC AGG ATC CCG CAG GCC ATC TTG GAC ATG 468
ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC GTG GTG 546
GTA CTG TTG CTG TTT ACC GGC GTC GAT GCG 576

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DR1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CAC CAA GTG CGC AAC TCT ACA GGG CTT TAC CAT GTC ACC 39
AAT GAT TGC CCT AAT TCG AGT ATT GTG TAC GAG GCG GCC 78
GAT GCC ATC CTG CAC GCG CCG GGG TGT GTC CCT TGC GTT 117
CGC GAG GGT AAC GCC TCG AGG TGT TGG GTG GCG GTG ACC 156
CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195
CAG CTT CGA CGT CAC ATC GAC CTG CTT GTC GGG AGC GCC 234
ACC CTC TGC TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273
TCT GTC TTC CTT GTC GGT CAA CTG TTC ACC TTT TCT CCC 312
AGG CGC CAC TGG ACA ACG CAA GAC TGC AAT TGT TCT ATC 351
TAT CCC GGC CAT ATA ACG GGA CAC CGT ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACG ACA GCG CTG GTA ATG 429
GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468
ATC GCT GGA GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC GTG GTA 546
GTG CTG TTG CTG TTT GCC GGC GTT GAT GCG 576

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DR4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

[nucleotides 1-156 illegible in source]
CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195
CAG CTC CGA CGT CAC ATC GAC CTG CTT GTC GGG AGC GCC 234
ACC CTC TGC TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273
TCT GTC TTC CTT GTC GGT CAA CTG TTC ACC TTC TCT CCC 312
AGG CAC CAC TGG ACA ACG CAA GAC TGC AAT TGT TCC ATC 351
TAT CCC GGC CAT ATA ACG GGC CAC CGC ATG GCG TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACG ACA GCG CTG GTA GTA 429
GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468
ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546
GTG CTG TTG CTG TTT GCC GGC GTT GAT GCG 576

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S14

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTT ACC 39
AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG ACA GCT 78
GAT GCT ATC CTA CAC GCT CCG GGA TGT GTC CCT TGC GTT 117
CGT GAG GGT AAC ACC TCG AGG TGT TGG GTG GCG ATG ACC 156
CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC GCA ACG 195
CAG CTT CGA CGT TAC ATC GAT CTG CTT GTC GGG AGC GCC 234
ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273
TCT GTC TTT CTT GTC GGT CAG CTG TTT ACC TTC TCT CCC 312
AGG CGC CTC TGG ACG ACG CAA GAC TGC AAT TGT TCT ATC 351
TAT CCC GGC CAT ATA ACG GGT CAT CGC ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACG ACG GCA CTG GTA GTA 429
GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAT ATG 468
ATC GCT GGT GCT CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GTG GGA AAC TGG GCG AAG GTC CTA GTG 546
GTG CTG CTG CTA TTC GCC GGC GTT GAC GCG 576



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TAC CAA GTA CGC AAC TCC ACG GGC CTT TAC CAT GTC ACC 39
AAT GAC TGC CCT AAC TCG AGC ATT GTG TAC GAG ACG GCC 78
GAT ACC ATC CTA CAC TCT CCG GGG TGT GTC CCT TGC GTT 117
CGC GAG GGT AAC GCC TCG AGA TGT TGG GTG CCG GTG GCC 156
CCC ACA GTT GCC ACC AGG GAC GGC AAA CTC CCC GCA ACG 195
CAG CTT CGA CGT CAC ATC GAT CTG CTT GTT GGG AGC GCC 234
ACC CTC TGC TCG GCC CTC TAT GTG GGG GAC CTG TGC GGG 273
TCT GTC TTT CTT GTC AGC CAG CTG TTC ACT ATC TCC CCC 312
AGG CGC CAC TGG ACA ACG CAA GAC TGC AAC TGT TCT ATC 351
TAC CCC GGC CAT ATA ACG GGT CAC CGT ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACA ACG GCG TTG GTA ATA 429
GCT CAG CTG CTC AGG GTC CCG CAA GCC GTC TTG GAC ATG 468
ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GCG GGG AAC TGG GCG AAG GTC CTG CTA 546
GTG CTG TTG CTG TTT GCC GGC GTC GAT GCG 576

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

TAC CAA GTA CGC AAC TCC TCG GGC CTT TAC CAT GTC ACC 39
AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG ACG GCC 78
GAT GCC ATT CTA CAC TCT CCA GGG TGT GTC CCT TGC GTT 117
CGC GAG GAT GGC GCC CCG AAG TGT TGG GTG GCG GTG GCC 156
CCC ACA GTC GCC ACT AGG GAC GGC AAA CTC CCT GCA ACG 195
CAG CTT CGA CGT CAC ATC GAT CTG CTT GTC GGA AGC GCC 234
ACC CTC TGC TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273
TCT GTC TTT CTC GTC AGT CAA CTG TTC ACG TTC TCC CCC 312
AGG CGC CAC TGG ACA ACG CAA GAC TGT AAC TGT TCT ATC 351
TAT CCC GGC CAC ATA ACG GGT CAC CGC ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCC CCC ACA ACA GCG CTG GTA GTA 429
GCT CAG CTG CTC AGG ATC CCG CAA GCC GTC TTG GAC ATG 468
ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG ATA 546
GTG CTG TTG CTG TTT TCC GGC GTC GAT GCG 576



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TAC CAA GTA CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC 39
AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG GCG GCC 78
GAT GCC ATC CTG CAC ACT CCG GGG TGT GTT CCT TGC GTT 117
CGC GAG GGT AAC GCT TCG AGG TGT TGG GTG GCG ATG ACC 156
CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195
CAA CTT CGA CGT CAC ATC GAT CTG CTT GTC GGG AGC GCC 234
ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273
TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC 312
AGA CGC CAC TGG ACG ACG CAG GGC TGC AAT TGT TCT ATC 351
TAT CCC GGC CAT ATA ACG GGT CAC CGC ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACG GCG GCG TTG GTG GTA 429
GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468
ATC GCT GGT GCT CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546
GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG 576



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: D1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39
AAC GAC TGT TCC AAC TCG AGC ATT GTG TAT GAG ACA GCG 78
GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG GAC AAC TCC TCT CGC TGC TGG GTA GCG CTC ACC 156
CCC ACG CTC GCG GCT AGG AAT GGC AAC GTC CCC ACT ACG 195
GCG ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCC ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTC ATC TCC CAG CTG TTC ACC CTC TCG CCT 312
CGC CGG CAT GAG ACG GTA CAG GAG TGT AAT TGC TCA ATC 351
TAT CCC GGC CAC GTG ACA GGT CAC CGT ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC TTA GTG GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTC GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTC TTT GCT GGC GTT GAC GGC 576

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: D3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAA GTC ACC 39
AAT GAC TGT TCC AAC TCG AGC ATC GTG TAT GAG ACA GCG 78
GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG GAC AAC TCC TCT CGC TGC TGG GTA GCG CTC ACC 156
CCC ACG CTC GCG GCT AGG AAT AGC AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCC ATG TAC GTG GGG GAT CTT TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTC TCG CCT 312
CGC CGG CAT GAG ACA GTA CAG GAA TGT AAC TGC TCA ATC 351
TAT CCC GGC CAC GTG ACA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCT ACA GCA GCC CTA GTG GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTC GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTC TTT GCT GGC GTC GAC GGC 576

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAC GTC ACA 39
AAC GAC TGC TCC AAC TCA AGC ATC GTG TAT GAG GCA GTG 78
GAC GTG ATC ATG CAT ACC CCA GGG TGC GTG CCC TGC GTT 117
CGG GAG AAC AAC CAC TCC CGT TGC TGG GTA GCG CTC ACC 156
CCC ACG CTC GCG GCC AGG AAC GCC AGC ATC CCC ACT ACG 195
ACA ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTC TGC GGA 273
TCC GTT TTC CTC GTC TCT CAG CTG TTC ACC TTT TCA CCT 312
CGC CGG CAT GAG ACA GCA CAG GAC TGC AAC TGC TCA ATC 351
TAT CCC GGC CAC GTT TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC CTA GTG CTA 429
TCG CAG TTA CTC CGA ATC CCA CAA GCT GTC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTC GCC 507
TAC TAC TCC ATG GCG GGG AAC TGG GCC AAG GTT TTA ATT 546
GTG TTG CTA CTC TTT GCC GGC GTT GAT GGG 576



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TAT GAA GTG CGC AAC GTG TCC GGG ATA TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC GTC GTG TAT GAG ACA GCA 78
GAC ATG ATC ATG CAT ACC CCT GGA TGC GTG CCC TGC GTA 117
CGG GAG AAC AAC TCC TCC CGC TGT TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC GTC AGC GTC CCC ACC ACG 195
ACA ATA CGA CGT CAC GTC GAC TTG CTC GTT GGG GCG GCT 234
GCC TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCG CCT 312
CGC CGA CAC GAG ACA GTA CAG GAC TGC AAC TGC TCA CTC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACA GCA GCC CTA GTG GTG 429
TCG CAA TTA CTC CGG ATC CCG CAA GCT GTC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGA AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTT TTT GCC GGC GTT GAT GGG 576



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CAT GAA GTG CAC AAC GTA TCC GGG ATC TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAT ACC CCC GGG TGC GTG CCC TGC GTC 117
CGG GAG AAC AAC TCC TCC CGT TGC TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC GCC AGC ATC CCC ACT ACG 195
ACA ATA CGA CGC CAT GTC GAC TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCC ATG TAC GTG GGA GAT CTC TGC GGA 273
TCT GTC TTC CTC GTC TCC CAG TTG TTC ACC TTC TCG CCT 312
CGC CGG CAT GAG ACG GTA CAG GAC TGC AAT TGC TCA ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTA GTG GTA 429
TCG CAG TTA CTC CGA CTC CCA CAA GCT GTC ATG GAC ATG 468
GTG GCG GGA GCC CAC TGG GGA GTC CTA GCG GGC CTT GCT 507
TAC TAT TCC ATG GTG GGG AAC TGG GCC AAG GTT TTG ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TAT GAA GTG CGC AAC GTG TCC GGG ??? ?AC CAT GTC ACG 39
AAC GAC TGC TCC AAC TTA AGC ATC ?TG TAC GAG ACA ACG 78
GAC ATG ATC ATG CAC ACC CCT GGG ?GC GTG CCC TGC GTT 117
CGG GAA AAC AAC TCC TCC CGT TGT ?GG GTA GCG CTC GCC 156
CCC ACG CTC GCG GCC AGG AAC GCC ??C GTC CCC ACC ACG 195
GCA ATA CGA CGC CAC GTC GAC TTG ??C GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAC GTG ?GG GAT CTT TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG ?TC ACC TTC TCG CCT 312
CGC CGA CAC GAG ACG GTA CAG GAC ??? ??? TGC TCA ??? 351
TAT CCC GGC CAC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC CTA GTG GTG 429
TCG CAG TTA CTC CGG ATC CCG CAA GCT GTC GTG GAC ATG 468
GTA GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGA AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTT TTT GCC GGC GTT GAT GGG 576











(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TAT GAA GTG CGC AAC GTG TCC GGG ATA TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATC GTG TAT GAA ACA GCG 78
GAC ATG ATT ATG CAT ACC CCT GGA TGC ATG CCC TGC GTT 117
CGG GAG AAC AAC TCC TCC CGT TGC TGG GTG GCG CTC ACT 156
CCC ACG CTC GCG GCT AGG AAT GTC AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAC TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTT TCG CCT 312
CGC CGA CAC GAG ACG GTA CAG GAC TGC AAC TGC TCA ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACA ACA GCC CTA GTG GTG 429
TCG CAG TTA CTC CGG ATC CCG CAA GCT ATC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGC AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTG TTT GCC GGC GTT GAT GGG 576

WO 96/05315 PCT/US95/10398

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: IND5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAC ACT CCC GGG TGC GTG CCC TGC GTT 117
??? ??? ??? ??? ??? ??? ??? TGC TGG GTA GCG CTC ACT 156
CCC ACT CTC GCG GCC AGG AAC GCC AGC GTC TCC ACC ACG 195
ACA ATA CGA CAC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTA TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTC TCA CCG 312
CGC CGG CAT GAG ACA GTA CAG GAC TGC AAT TGC TCC ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCC TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTA GTG GTA 429
TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAT ATG 468
GTG GCG GGG GCC CAC TGG GGA ATC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTA GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: IND8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TAT GAG GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG GGC AAC TTC TCT AGT TGC TGG GTA GCG CTC ACT 156
CCC ACT CTC GCG GCT AGG AAC GCC AGC GTC CCC ACC ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCA CCG 312
CGC CGG CAT GAG ACA GTA CAG GAC TGC AAT TGC TCC ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA GCG GCC CTA GTG GTA 429
TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAT ATG 468
GTG GCG GGG GCC CAC TGG GGA ATC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTA GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: P10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78
GAC ATG ATA ATG CAC ACC CCC GGG TGC GTG CCC TGT GTT 117
CGG GAG AAC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACA CTC GCG GCT AGG AAT TCC AGC GTC CCA ACT ACG 195
GCA ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT CTC CTC GTC TCC CAG CTG TTC ACC TTC TCA CCT 312
CGC CGG CAT TGG ACA GTA CAG GAC TGC AAT TGT TCA ATC 351
TAT CCT GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACA GCA GCC CTA GTG GTG 429
TCG CAG CTA CTC CGG ATC CCA CAA GCT ATC TTG GAT GTG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTC TTG ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAC GGA 576



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TAT GAA GTG CGC AAC GTA TCC GGG GCG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGT ATT GTG TAC GAG GCA GCG 78
GAC GTG ATC ATG CAT ACC CCC GGG TGT GTA CCC TGC GTT 117
CAG GAG GGT AAC TCC TCC CAA TGC TGG GTG GCG CTC ACC 156
CCC ACG CTC GCG GCC AGG AAC GCT ACC GTC CCC ACC ACG 195
ACA ATA CGA CGT CAT GTC GAT TTG CTC GTT GGG GCG GCT 234
GTT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTG TGC GGA 273
TCT GTT TTC CTC ATC TCC CAG CTG TTC ACC ATC TCG CCC 312
CGT CGG CAT GAG ACA GTA CAG AAC TGC AAT TGC TCA ATC 351
TAT CCC GGA CAC GTG ACA GGT CAT CGC ATG GCC TGG GAT 390
ATG ATG ATG AAC TGG TCG CCT ACA ACA GCC CTA GTG GTA 429
TCG CAG CTA CTC CGG ATC CCA CAA GCT GTC ATG GAT ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTC GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTT TTT GCT GGT GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S45

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TAT GAA GTG CGC AAC GTG TCC GGG GCG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GTG 78
GAC GTG ATC CTG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117
CGG GAG AAC AAC TCC TCC CGT TGC TGG GTG GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC TCC AGC GTC CCC ACT ACG 195
ACA ATA CGA CGT CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTT GTT TCC CAG CTG TTC ACC TTC TCG CCT 312
CGT CGG CAT GAG ACA GTA CAG GAC TGC AAC TGT TCA ATC 351
TAT CCC GGC CAC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCT ACA GCA GCC TTA GTG GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT CTG ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG AAC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC TCC AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCC ATG TAC GTG GGG GAC CTC TGC GGA 273
TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCG CCT 312
CGC CGG TAT GAG ACA GTA CAG GAC TGC AAT TGC TCA ATC 351
TAT CCC GGC CGC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA ACA GCT CTA GTA GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT ATC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTT ATG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

TAT GAA GTG CGC AAC GTG TCC GGG GTG TAT CAT GTC ACG 39
AAC GAC TGT TCC AAC TCA AGC ATT GTG TAT GAG ACA GCG 78
GAC ATG ATC ATG CAT ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG GCC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTA GCA GCC AGG AAC ACC AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GTT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG TTC ACT TTT TCA CCT 312
CGC CGG CAC GAG ACA GTA CAG GAC TGC AAC TGT TCC ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAC 390
ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTG GTG GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468
GTA GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCA 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTC TTT GCT GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:



TAC GAA GTG CGC AAC GTG TCC GGG GTG TAC TAT GTC ACG 39
AAC GAC TGT TCC AAC TCA AGC ATT GTG TAT GAG ACA GCG 78
GAC ATG ATC ATG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117
CGG GAG AGC AAT TCC TCC CGC TGC TGG GTA GCG CTT ACT 156
CCC ACG CTC GCG GCC AGG AAC GCC AGC GTC CCC ACT AAG 195
ACA ATA CGA CGT CAC GTC GAC TTG CTC GTT GGG GCG GCT 234
GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG TTC ACT TTC TCG CCT 312
CGC CGG CAT GAG ACA GTA CAG GAC TGC AAC TGC TCA ATC 351
TAT CCC GGC CAC GTA ACA GGT CAC CGT ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACA ACG GCA CTA GTG GTG 429
TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG CTG CTA CTC TTT GCC GGC GTT GAT GGG 576

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 


(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TTT GAG GCA GCG 78
GAC TTG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG GGC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC ACC AGC GTC CCC ACT ACG 195
ACG ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAT GTG GGA GAC CTC TGC GGA 273
TCT GTT TTC CTC GTC TCT CAG CTG TTC ACC TTC TCG CCT 312
CGC CGG CAT GAG ACT TTG CAG GAC TGC AAC TGC TCA ATC 351
TAT CCC GGC CAT CTG TCA GGT CAC CGC ATG GCT TGG GAC 390
ATG ATG ATG AAC TGG TCG CCT ACA ACA GCT CTA GTG GTG 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468
GTG ACA GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GCG GGG AAC TGG GCT AAG GTT TTA ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAT GGG 576







SUBSTITUTE SHEET (RULE 26) 



WO 96/05315 



PCT/US95/10398 



- 97 - 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homo sapiens 
(C) INDIVIDUAL ISOLATE: US 6 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAC ACT CCC GGG TGC GTG CCC TGT GTT 117
CGG GAG AAC AAT TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC GCT AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
ACT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTC TGC GGG 273
TCC GTT TTC CTC ATC TCC CAG CTG TTC ACC TTC TCG CCT 312
CGT CAG CAT GAG ACA GTA CAG GAC TGC AAT TGT TCA ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCA CCT ACA GCA GCC CTA GTG GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT CTG ATT 546
GTG TTG CTA CTC TTT GCC GGC GTT GAC GGG 576









(2) INFORMATION FOR SEQ ID NO: 26:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

GCC CAA GTG AGG AAC ACC AGC CGC GGT TAC ATG GTG ACT 39
AAC GAC TGT TCC AAT GAG AGC ATC ACC TGG CAG CTC CAA 78
GCC GCG GTT CTC CAC GTC CCC GGG TGT ATC CCG TGT GAG 117
AGG CTG GGA AAT ACA TCC CGA TGC TGG ATA CCG GTC ACA 156
CCA AAC GTG GCC GTG CGG CAG CCC GGC GCT CTT ACG CAG 195
GGC TTG CGG ACG CAC ATC GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCT GCC CTC TAC GTG GGG GAC CTC TGC GGC 273
GGG GTG ATG CTC GCA GCC CAG ATG TTC ATT GTC TCG CCG 312
CGA CGC CAC TGG TTT GTG CAA GAA TGC AAT TGC TCC ATC 351
TAC CCC GGT ACC ATC ACT GGA CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACA GCC ACC ATG ATC CTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATC GGC GGG GCT CAC TGG GGC GTC ATG TTT GGC TTG GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAG GTC ATT GTC 546
ATC CTC TTG CTG GCT GCT GGG GTG GAC GCG 576



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: T4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

GCA CAA GTG AAG AAC ACC ACT AAC AGC TAC ATG GTG ACC 39
AAC GAC TGT TCT AAT GAC AGC ATC ACT TGG CAG CTC CAG 78
GCC GCG GTC CTC CAC GTC CCC GGG TGT GTC CCG TGC GAG 117
AAA ACG GGA AAT ACA TCT CGG TGC TGG ATA CCG GTT TCA 156
CCA AAC GTG GCC GTG CGG CAG CCC GGC GCC CTC ACG CAG 195
GGC TTG CGG ACG CAC ATT GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCT GCT CTT TAC GTG GGG GAC CTC TGC GGC 273
GGG GTG ATG CTC GCA GCC CAG ATG TTC ATC GTC TCG CCG 312
CAA CAT CAC TGG TTT GTG CAA GAC TGC AAT TGC TCT ATC 351
TAC CCT GGC ACC ATC ACT GGA CAC CGT ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACG GCC ACC ATG ATC CTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC TTA GAC ATC 468
GTT AGC GGG GCA CAC TGG GGC GTC ATG TTC GGC TTG GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAA GTC GTT GTC 546
ATC CTT CTG CTG GCC GCT GGG GTG GAC GCG 576



(2) INFORMATION FOR SEQ ID NO:28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

GCC GAA GTG AAG AAC ACC AGT ACC AGC TAC ATG GTG ACA 39
AAT GAC TGT TCC AAC GAC AGC ATC ACC TGG CAA CTC CAG 78
GCC GCG GTC CTC CAC GTC CCC GGG TGC GTC CCG TGC GAG 117
AGA GTT GGA AAC GCG TCG CGG TGC TGG ATA CCG GTC TCG 156
CCA AAC GTA GCT GTG CAG CGG CCT GGC GCC CTC ACG CAG 195
GGC TTG CGG ACG CAC ATC GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCC GCT CTC TAC GTG GGG GAT CTC TGC GGC 273
GGG GTA ATG CTC GCC GCT CAG ATG TTC ATT ATC TCG CCG 312
CAG CAC CAC TGG TTT GTG CAG GAA TGC AAC TGC TCC ATT 351
TAC CCT GGT ACC ATC ACT GGA CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACA ACC ACC ATG ATC TTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATC AGC GGA GCT CAC TGG GGC GTC ATG TTC GGC CTA GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAG GTC GTT GTC 546
ATC CTG TTG CTC ACC GCT GGC GTG GAC GCG 576



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US10 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GTC CAA GTG AAA AAC ACC AGT ACC AGC TAT ATG GTG ACC 39
AAT GAC TGC TCC AAC GAC AGC ATC ACT TGG CAA CTT GAG 78
GCT GCG GTC CTC CAC GTT CCC GGG TGT GTC CCG TGC GAG 117
AAA GTG GGA AAT ACA TCT CGG TGC TGG ATA CCG GTC TCA 156
CCA AAT GTG GCC GTG CAG CGG CCT GGC GCC CTC ACG CAG 195
GGC TTG CGG ACT CAC ATC GAC ATG GTC GTG ATG TCC GCC 234
ACG CTC TGC TCC GCT CTT TAC GTG GGG GAC TTC TGC GGT 273
GGG ATG ATG CTC GCA GCC CAA ATG TTC ATT GTC TCG CCG 312
CGC CAC CAC TCG TTT GTG CAG GAA TGC AAC TGC TCC ATC 351
TAC CCC GGT ACC ATC ACC GGG CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACG GCC ACT TTG ATC CTG 429
GCG TAC GTG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATT AGC GGG GCG CAT TGG GGC GTC TTG TTC GGC TTA GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAA GTC GTT GTC 546
ATC CTT CTG CTA GCC GCT GGG GTG GAC GCG 576



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

GTG GAA GTC AGG AAC ATC AGT TCC AGC TAC TAC GCC ACC 39
AAT GAT TGC TCA AAC AAC AGC ATC ACC TGG CAA CTC ACC 78
GAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC CTG CGC TGC TGG ATA CAA GTG ACA 156
CCT AAT GTG GCT GTG AAA CAC CGC GGC GCA CTT ACT CAT 195
AAC CTG CGA ACA CAC GTC GAC GTG ATC GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC GTA TGC GGG 273
GCC GTG ATG ATC GTG TCG CAG GCT CTC ATA ATA TCG CCT 312
GAA CGC CAC AAC TTT ACC CAG GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CAT ATC ACC GGC CAC CGC ATG GCA TGG GAC 390
ATG ATG CTA AAC TGG TCA CCA ACT CTT ACC ATG ATC CTC 429
GCC TAT GCC GCT CGT GTT CCT GAG CTA GCC CTC CAG GTT 468
GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAG GGA GCG TGG GCC AAA GTC ATT GCC 546
ATC CTC CTT CTT GTC GCA GGA GTG GAT GCA 576

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GTG GAA GTC AGG AAC ACC AGT TCT AGT TAC TAC GCC ACC 39
AAT GAT TGC TCA AAC AAC AGC ATC ACC TGG CAA CTC ACC 78
AAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC CTG CAC TGC TGG ATA CAA GTG ACA 156
CCT AAT GTG GCT GTG AAA CAC CGC GGC GCA CTC ACT CAC 195
AAC CTG CGA GCA CAT ATA GAT ATG ATT GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC GTG TGC GGG 273
GCC GTG ATG ATC GTG TCG CAG GCT TTC ATA GTA TCG CCA 312
GAA CAC CAC CAC TTT ACC CAA GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CAC ATC ACC GGC CAC CGC ATG GCA TGG GAC 390
ATG ATG CTT AAC TGG TCA CCA ACT CTC ACC ATG ATC CTC 429
GCC TAT GCC GCC CGT GTT CCT GAG CTA GTC CTT GAA GTC 468
GTC TTC GGT GGT CAT TGG GGT GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAG GGA GCG TGG GCC AAG GTC ATT GCC 546
ATC CTC CTT CTT GTA GCA GGA GTG GAT GCA 576

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

GTG GAA GTC AGG AAC ATC AGT TCT AGC TAC TAT GCC ACC 39
AAT GAT TGC TCA AAC AGC AGC ATC ACC TGG CAA CTC ACC 78
AAC GCA GTC CTC CAC CTT CCC GGA TGC GTC CCG TGT GAG 117
AAT GAT AAT GGC ACC CTG CAC TGC TGG ATA CAA GTG ACA 156
CCT AAT GTG GCT GTG AAA CAC CGC GGC GCG CTC ACT CAC 195
AAC CTG CGA GCA CAC GTC GAT ATG ATC GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC ATG TGC GGG 273
GCC GTG ATG ATC GTG TCG CAG GCT TTC ATA ATA TCG CCA 312
GAA CGC CAC AAC TTT ACC CAA GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CGT ATC ACC GGC CAC CGC ATG GCG TGG GAC 390
ATG ATG CTA AAC TGG TCA CCA ACT CTT ACC ATG ATC CTT 429
GCC TAT GCC GCT CGT GTT CCT GAG CTA GTC CTT GAA GTT 468
GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAA GGA GCG TGG GCC AAG GTC ATT GCC 546
ATC CTC CTG CTT GTC GCA GGA GTG GAT GCA 576

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

GTG GAA GTT AGA AAC ACC AGT TTT AGC TAC TAC GCC ACC 39
AAT GAT TGC TCG AAC AAC AGC ATC ACC TGG CAG CTC ACC 78
AAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC TTG CGC TGC TGG ATA CAA GTA ACA 156
CCT AAT GTG GCT GTG AAA CAC CGT GGC GCA CTC ACT CAC 195
AAC CTG CGA ACG CAT GTC GAC GTG ATC GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGG GAC GTG TGC GGG 273
GCC GTG ATG ATA GCG TCG CAG GCT TTC ATA ATA TCG CCA 312
GAA CGC CAC AAC TTC ACC CAG GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CAT ATC ACC GGC CAC CGC ATG GCA TGG GAC 390
ATG ATG CTG AAC TGG TCA CCA ACT CTC ACC ATG ATC CTC 429
GCC TAC GCT GCT CGT GTG CCT GAA CTA GTC CTT GAA GTT 468
GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAA GGA GCG TGG GCC AAA GTC ATC GCC 546
ATC CTC CTC CTT GTC GCA GGA GTG GAC GCA 576



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S83 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

GTG GAG GTC AAG GAC ACC GGC GAC TCC TAC ATG CCG ACC 39
AAC GAT TGC TCC AAC TCT AGT ATC GTT TGG CAG CTT GAA 78
GGA GCA GTG CTT CAT ACT CCT GGA TGC GTC CCT TGT GAG 117
CGT ACC GCC AAC GTC TCT CGA TGT TGG GTG CCG GTT GCC 156
CCC AAT CTC GCC ATA AGT CAA CCT GGC GCT CTC ACT AAG 195
GGC CTG CGA GCA CAC ATC GAT ATC ATC GTG ATG TCT GCT 234
ACG GTC TGT TCT GCC CTT TAT GTG GGG GAC GTG TGT GGC 273
GCG CTG ATG CTG GCC GCT CAG GTC GTC GTC GTG TCG CCA 312
CAA CAC CAT ACG TTT GTC CAG GAA TGC AAC TGT TCC ATA 351
TAC CCG GGC CGC ATT ACG GGA CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACT ACC ACC ATG CTC CTG 429
GCG TAC TTG GTG CGC ATC CCG GAA GTC ATC TTG GAT ATT 468
GTT ACA GGA GGT CAT TGG GGT GTA ATG TTT GGC CTC GCT 507
TAC TTC TCC ATG CAG GGA TCG TGG GCG AAG GTC ATC GTT 546
ATC CTC CTG CTG ACT GCT GGG GTG GAG GCG 576

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK12

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:



TTA GAG TGG CGG AAT GTG TCC GGC CTC TAC GTC CTT ACC 39

AAC GAC TGT TCC AAT AGC AGT ATC GTG TAT GAG GCC GAT 78 

GAC GTC ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117

CAG GAC GGC AAT ACA TCT ACG TGC TGG ACC TCA GTG ACG 156 

CCT ACA GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195 

TCG ATA CGC AGT CAT GTG GAC CTG CTA GTG GGC GCG GCC 234 

ACG ATG TGC TCT GCG CTC TAC GTG GGT GAT GTG TGT GGG 273 

GCC GTC TTC CTT GTG GGA CAA GCC TTC ACG TTC AGA CCT 312 

CGT CGC CAT CAA ACA GTC CAG ACC TGT AAC TGC TCG CTG 351 

TAC CCA GGC CAT CTT TCA GGA CAT CGA ATG GCT TGG GAT 390

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTA 429 

GCG CAC GTC CTG CGT CTG CCC CAG ACC TTG TTC GAC ATA 468 

ATA GCT GGG GCC CAT TGG GGC ATC ATG GCG GGC CTA GCC 507 

TAT TAC TCC ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546 

ATC ATG GTT ATG TTT TCA GGA GTC GAT GCC 576 



(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 



CTA GAG TGG CGG AAT GTG TCT GGC CTC TAT GTC CTT ACC 39 

AAC GAC TGT CCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78

GAC GTC ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117 

CAG GAC GGC AAT ACA TCC ACG TGC TGG ACC TCG GTG ACA 156 

CCT ACA GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCC 195 

TCG ATA CGC AGT CAT GTG GAC CTG TTA GTG GGC GCG GCC 234 

ACG ATG TGC TCT GCG CTC TAC GTG GGC GAT ATG TGT GGG 273 

GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCG 312 

CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351

TAC CCA GGC CAC CTT TCA GGA CAT CGA ATG GCT TGG GAT 390 

ATG ATG ATG AAT TGG TCC CCC GCC GTG GGT ATG GTG GTG 429 

GCG CAC GTC CTG CGG TTG CCC CAG ACC TTG TTC GAC ATA 468 

ATA GCC GGG GCC CAT TGG GGC ATC TTG GCA GGC CTA GCC 507 

TAT TAC TCC ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546 

ATC ATG GTT ATG TTT TCA GGG GTC GAT GCC 576 



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC CTC ACC 39
AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78
GAC GTT ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117
CAG GAC GGT AAT ACA TCC ACG TGC TGG ACC CCA GTG ACA 156
CCT ACA GTG GCA GTC AGG TAT GTC GGA GCA ACC ACC GCT 195
TCG ATA CGC AGT CAT GTG GAC CTA TTG GTG GGC GCG GCC 234
ACT ATG TGC TCT GCG CTC TAC GTG GGT GAT ATG TGT GGG 273
GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312
CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAT CTT TCA GGA CAT CGC ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429
GCG CAC GTT CTG CGT TTG CCC CAG ACC GTG TTC GAC ATA 468
ATA GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507
TAT TAC TCC ATG CAA GGC AAC TGG GCC AAG GTC GCT ATC 546
ATC ATG GTT ATG TTT TCA GGG GTC GAC GCC 576

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: S52 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:



CTA 


GAG 


TGG 


CGG 


AAT 


ACG 


TCT 


GGC 


CTC 


TAT 


AAC 


GAC 


TGT 


TCC 


AAT 


AGC 


AGT 


ATT 


GTG 


TAT 


GAC 


GTC 


ATT 


CTG 


CAC 


ACA 


CCC 


GGC 


TGT 


GTA 


CAG 


GAC 


GGC 


AAT 


ACA 


TCC 


ATG 


TGC 


TGG 


ACC 


CCT 


ACG 


GTG 


GCA 


GTC 


AGG 


TAC 


GTC 


GGA 


GCA 


TCG 


ATA 


CGC 


AGT 


CAT 


GTG 


GAC 


CTA 


TTA 


GTG 


ACG 


CTG 


TGC 


TCT 


GCG 


CTC 


TAT 


GTG 


GGT 


GAT 


GCC 


GTC 


TTT 


CTC 


GTG 


GGA 


CAA 


GCC 


TTC 


ACG 



TC CTT ACC 3 9 

\G GCC GAT 78 

-T TGT GTT 117 

2A GTG ACA 156 

TC ACC GCT 195 

3C GCG GCC 234 

..ww w*w iwi vjv.\j uiu x*vi <jiw Wil <jAT ATG TGT GGG 273 

33 GCC GTC TTT PTP CZTrz nni r»a mr>r> i^r* * r^r* t^C AGA CCT 312 
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CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351 

TAC CCA GGC CAT GTT TCA GGA CAT CGA ATG GCT TGG GAT 390 

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429

GCG CAC ATC CTG CGA TTG CCC CAG ACC TTG TTT GAC ATA 468

CTG GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507 

TAT TAT TCT ATG CAG GGC AAC TGG GCC AAG GTC GCT ATT 546

GTC ATG ATT ATG TTT TCA GGG GTC GAT GCC 576 



(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: S54 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: 



CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT ATC CTT ACC 39

AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78 

GAC GTC ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT 117 

CAG GAC GGC AAT ACA TCC ACG TGC TGG ACC CCA GTG ACA 156 

CCT ACG GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195 

TCG ATA CGC AGT CAT GTG GAC CTA TTA GTG GGC GCG GCC 234 

ACG CTG TGC TCT GCG CTC TAT GTG GGT GAT ATG TGT GGG 273

GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312 

CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351 

TAC CCA GGC CAT CTT TCA GGA CAT CGA ATG GCT TGG GAT 390

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429 

GCG CAC ATC CTG CGA TTG CCC CAG ACC TTG TTT GAC ATA 468

CTG GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507 

TAT TAT TCT ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546 

ATC ATG ATT ATG TTT TCA GGG GTC GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

GAG CAC TAC CGG AAT GCT TCG GGC ATC TAT CAC ATC ACC 39
AAT GAT TGT CCG AAT TCC AGT ATA GTC TAT GAA GCT GAC 78
CAT CAC ATC CTA CAC TTG CCG GGG TGC GTA CCC TGT GTG 117
ATG ACT GGG AAC ACA TCG CGT TGC TGG ACG CCG GTG ACG 156
CCT ACA GTG GCT GTC GCA CAC CCG GGC GCT CCG CTT GAG 195
TCG TTC CGG CGA CAT GTG GAC TTA ATG GTA GGC GCG GCC 234
ACT TTG TGT TCT GCC CTC TAT GTT GGG GAC CTC TGC GGA 273
GGT GCC TTC CTG ATG GGG CAG ATG ATC ACT TTT CGG CCG 312
CGT CGC CAC TGG ACC ACG CAG GAG TGC AAT TGT TCC ATC 351
TAC ACT GGC CAT ATC ACC GGC CAC AGG ATG GCG TGG GAC 390
ATG ATG ATG AAC TGG AGC CCT ACC ACC ACT CTG CTC CTC 429
GCC CAG ATC ATG AGG GTC CCC ACA GCC TTT CTC GAC ATG 468
GTT GCC GGA GGC CAC TGG GGC GTC CTC GCG GGC TTG GCG 507
TAC TTC AGC ATG CAA GGC AAT TGG GCC AAG GTA GTC CTG 546
GTC CTT TTC CTC TTT GCT GGG GTA GAC GCC 576



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

GTG CAC TAC CGG AAT GCT TCG GGC GTC TAT CAT GTC ACC 39
AAT GAT TGC CCT AAC ACC AGC ATA GTG TAC GAG ACG GAG 78
CAC CAC ATC ATG CAC TTG CCA GGG TGT GTC CCC TGT GTG 117
CGG ACG GAG AAT ACT TCT CGC TGC TGG GTG CCC TTG ACC 156
CCC ACT GTG GCC GCG CCC TAT CCC AAC GCA CCG TTA GAG 195
TCC ATG CGC AGG CAT GTA GAC CTG ATG GTG GGT GCG GCT 234
ACT ATG TGT TCC GCC TTC TAC ATT GGA GAT CTG TGT GGA 273
GGC GTC TTC CTA GTG GGC CAG CTG TTC GAC TTC CGA CCG 312
CGC CGG CAC TGG ACC ACC CAG GAT TGC AAC TGC TCC ATC 351
TAT CCT GGT CAC GTC TCG GGC CAC AGG ATG GCC TGG GAC 390
ATG ATG ATG AAC TGG AGC CCT ACC AGC GCG CTG ATT ATG 429
GCT CAG ATC TTA CGG ATC CCC TCT ATC CTA GGT GAC TTG 468
CTC ACC GGG GGT CAC TGG GGA GTT CTT GCT GGT CTA GCT 507
TTC TTC AGC ATG CAG AGT AAC TGG GCG AAG GTC ATC CTG 546
GTC CTA TTC CTC TTT GCC GGG GTC GAG GGA 576

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 



GTT AAC TAT CGC AAT GCC TCG GGC GTC TAT CAC GTC ACC 39
AAC GAC TGC CCG AAC TCG AGC ATA GTG TAT GAG GCC GAA 78
CAC CAG ATC TTA CAC CTC CCA GGG TGC TTG CCC TGT GTG 117
AGG GTT GGG AAT CAG TCA CGC TGC TGG GTG GCC CTT ACT 156
CCC ACC GTG GCG GTG TCT TAT ATC GGT GCT CCG CTT GAC 195
TCC CTC CGG AGA CAT GTG GAC CTG ATG GTG GGC GCC GCT 234
ACT GTA TGC TCT GCC CTC TAC GTT GGA GAT CTG TGC GGT 273
GGT GCA TTC TTG GTT GGC CAG ATG TTC TCC TTC CAG CCG 312
CGA CGC CAC TGG ACT ACG CAG GAC TGC AAT TGT TCT ATC 351
TAC GCA GGG CAT ATC ACG GGC CAC AGG ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG AGT CCC ACA ACC ACC CTG CTT CTC 429
GCC CAG GTC ATG AGG ATC CCT AGC ACT CTG GTA GAT CTA 468
CTC GCT GGA GGG CAC TGG GGC GTC CTT GTT GGG TTG GCG 507
TAC TTC AGT ATG CAA GCT AAT TGG GCC AAA GTC ATC CTG 546
GTC CTT TTC CTC TTC GCT GGA GTT GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 



GTC AAC TAT CAC AAT GCC TCG GGC GTC TAT CAC ATC ACC 39

AAC GAC TGC CCG AAC TCG AGC ATA ATG TAT GAG GCC GAA 78

CAC CAC ATC CTA CAC CTC CCA GGG TGC GTA CCC TGT GTG 117

AGG GAG GGG AAC CAG TCA CGC TGC TGG GTG GCC CTT ACT 156

CCC ACC GTG GCG GCG CCT TAT ATC GGT GCA CCG CTT GAA 195

TCC ATC CGG AGA CAT GTG GAC CTG ATG GTA GGC GCT GCT 234

ACA GTG TGC TCC GCT CTC TAC ATT GGG GAC CTG TGC GGT 273

GGC GTA TTT TTG GTT GGT CAG ATG TTT TCT TTC CAG CCG 312

CGA CGC CAC TGG ACT ACG CAG GAC TGC AAT TGT TCC ATC 351

TAT GCG GGG CAC GTT ACA GGC CAC AGA ATG GCA TGG GAC 390

ATG ATG ATG AAC TGG AGT CCC ACA ACC ACC TTG GTC CTC 429

GCC CAG GTT ATG AGG ATC CCT AGC ACT CTG GTG GAC CTA 468

CTC ACT GGA GGG CAC TGG GGT ATC CTT ATC GGG GTG GCA 507

TAC TTC TGC ATG CAA GCT AAT TGG GCC AAG GTC ATT CTG 546

GTC CTT TTC CTC TAC GCT GGA GTT GAT GCC 576

(2) INFORMATION FOR SEQ ID NO:44: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK13



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

TAC AAC TAT CGC AAC AGC TCG GGT GTC TAC CAT GTC ACC 39

AAC GAT TGC CCG AAC TCG AGC ATA GTC TAT GAA ACC GAT 78

TAC CAC ATC TTA CAC CTC CCG GGA TGC GTT CCT TGC GTG 117

AGG GAA GGG AAC AAG TCT ACA TGC TGG GTG TCT CTC ACC 156

CCC ACC GTG GCT GCG CAA CAT CTG AAT GCT CCG CTT GAG 195

TCT TTG AGA CGT CAC GTG GAT CTG ATG GTG GGC GGC GCC 234

ACT CTC TGC TCC GCC CTC TAC ATC GGA GAC GTG TGT GGG 273

GGT GTG TTC TTG GTC GGT CAA CTG TTC ACC TTC CAA CCT 312

CGC CGC CAC TGG ACC ACC CAA GAC TGC AAT TGT TCC ATC 351

TAC ACA GGA CAT ATC ACA GGA CAC AGA ATG GCT TGG GAC 390

ATG ATG ATG AAT TGG AGC CCC ACT GCG ACG CTG GTC CTC 429

GCC CAA CTT ATG AGG ATC CCA GGC GCC ATG GTC GAC CTG 468

CTT GCA GGC GGC CAC TGG GGC ATT CTG GTT GGC ATA GCG 507

TAC TTC AGC ATG CAA GCT AAT TGG GCC AAG GTT ATC CTG 546

GTC CTG TTT CTC TTT GCT GGA GTC GAC GCT 576



(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

GTT CCC TAC CGG AAT GCC TCT GGG GTT TAC CAT GTC ACC 39 
AAT GAC TGC CCA AAC TCC TCC ATA GTC TAC GAG GCT GAT 78 
AGC CTG ATC TTG CAC GCA CCT GGC TGC GTG CCC TGT GTC 117





OTV TV 

LAA 


(sAl 


TV TV *P 


Vslv. 


AGT 


AGG 


TGC 


TGG 


GTC 


LAA 


TV > 1 ■/ < 

ATC 


TV 

ACC 


lbo 


CCC ACA CTG TCA GCC CCG ACC TTC GGA GCG GTC ACG GCT 195
CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGA GGA GCT 234
GCT CTC TGC TCC GCA CTA TAC GTC GGC GAC GCG TGC GGG 273
GCA GTG TTT CTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312
CGC CAG CAT ACC ACA GTG CAG GAC TGC AAC TGT TCC ATT 351
TAC AGT GGC CAT ATC ACC GGC CAC CGG ATG GCT TGG GAC 390
ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG CTG ATG 429
GCC CAG ATG CTA CGG ATC CCC CAG GTG GTC ATA GAC ATC 468
ATA GCC GGG GGC CAC TGG GGG GTC TTG TTT GCC GCC GCA 507
TAC TTT GCG TCG GCC GCC AAC TGG GCT AAG GTA GTG CTG 546
GTT CTG TTC CTG TTT GCG GGG GTC GAT GGC 576



(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 



GTT CCC TAC CGA AAC GCC TCT GGG GTT TAT CAT GTC ACC 39

AAT GAT TGC CCA AAC TCT TCC ATA GTT TAC GAG GCT GAT 78

AAC CTG ATC TTG CAT GCA CCT GGT TGC GTG CCT TGT GTC 117

AGG CAA GAT AAT GTC AGT AAG TGC TGG GTC CAA ATC ACC 156

CCC ACG TTG TCA GCC CCG AAT CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGA GGG GCT 234

GCC CTC TGC TCC GCA CTA TAC GTC GGG GAC GCG TGC GGG 273

GCA GTG TTT TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312

CGC CAG CAC ACT ACG GTG CAA GAC TGC AAT TGC TCT ATT 351

TAC AGT GGC CAT ATC ACC GGC CAC CGG ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACG GCC TTG CTG ATG 429

GCC CAG TTG CTA CGG ATT CCC CAG GTG GTC ATC GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTT GCC GCC GCA 507

TAT TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT ATA CTG 546

GTC TTG TTT CTG TTT GCG GGG GTC GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA5





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

GTC CCC TAC CGA AAT GCC TCT GGG GTT TAT CAT GTC ACC 39

AAT GAT TGC CCA AAC TCT TCC ATA GTC TAC GAG GCT GAT 78

AAC CTG ATT CTG CAC GCA CCT GGT TGC GTG CCC TGT GTC 117

AAG GAA GGT AAT GTC AGT AGG TGC TGG GTC CAA ATC ACC 156

CCC ACA TTG TCA GCC CCG AAC CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GTC GTT GAC TAC TTA GCG GGA GGG GCT 234

GCC CTC TGC TCC GCA CTA TAC GTC GGG GAC GCG TGC GGG 273

GCA GTG TTC TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312

CGC CAG CAT ACT ACG GTG CAG GAC TGC AAC TGT TCC ATT 351

TAC AGC GGC CAT ATC ACC GGC CAC CGA ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG GTG ATG 429

GCC CAG GTG CTA CGG ATT CCC CAA GTG GTC ATT GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GTC GCA 507

TAC TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT GTG CTG 546

GTC CTG TTT CTG TTT GCG GGG GTC GAT GGC 576

(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

GTT CCT TAC CGG AAT GCC TCT GGG GTG TAT CAT GTT ACC 39

AAT GAT TGC CCA AAC TCT TCC ATA GTC TAT GAG GCT GAT 78

GAC CTG ATC CTA CAC GCA CCT GGC TGC GTG CCC TGT GTC 117

CGG AAG GAT AAT GTC AGT AGA TGC TGG GTT CAT ATC ACC 156

CCC ACA CTA TCA GCC CCG AGC CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAT TAC TTG GCG GGA GGG GCC 234

GCC CTG TGC TCC GCG TTA TAC GTC GGA GAC GTG TGC GGG 273

GCA TTG TTT TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312

CGC CAG CAT GCT ACG GTA CAG GAC TGC AAC TGC TCC ATT 351

TAC AGT GGC CAT ATC ACT GGC CAC CGG ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCC GCG ACA GCC TTG GTG ATG 429

GCC CAA ATG CTA CGG ATT CCC CAG GTG GTC ATT GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GCT GCA 507

TAC TTC GCG TCG GCG GCT AAC TGG GCT AAG GTT GTG CTG 546

GTC TTG TTT CTG TTT GCG GGG GTT GAT GCC 576

(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

GTC CCC TAG CGA AAT GCC TCC GGG GTT TAT CAT GTC ACC 39

AAT GAT TGC CCG AAC TCT TCC ATA GTC TAT GAG GCT GAC 78

AAC CTG ATC CTG CAC GCA CCT GGT TGC GTG CCC TGT GTC 117

AGA CAA AAT AAT GTC AGT AGG TGC TGG GTC CAA ATC ACC 156

CCC ACA TTG TCA GCC CCG AAC CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC CTA GCG GGA GGG GCT 234

GCC CTC TGC TCC GCG CTA TAC GTC GGG GAC GCG TGC GGG 273

GCA GTG TTT TTG GTA GGC CAG ATG TTC AGC TAT AGG CCT 312

CGC CAG CAC ACT ACG GTG CAG GAC TGC AAC TGT TCC ATT 351

TAC AGT GGC CAT ATC ACC GGC CAC CGA ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG GTG ATG 429

GCC CAG TTG CTA CGG ATT CCC CAG GTG GTC ATC GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GCC GCA 507

TAT TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT GTG CTG 546

GTC TTG TTT CTG TTT GCG GGG GTC GAT GCC 576

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

GTT CCC TAC CGA AAT GCC TCT GGG GTT TAT CAT GTC ACC 39

AAT GAT TGC CCA AAC TCT TCC ATC GTC TAC GAG GCT GAT 78

GAC CTG ATC TTA CAC GCA CCT GGT TGC GTG CCC TGT GTT 117

AGG CAG GGT AAT GTC AGT AGG TGC TGG GTC CAG ATC ACC 156

CCC ACA CTG TCA GCC CCG AGC CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGG GGG GCT 234

GCC CTT TGC TCC GCG TTA TAC GTC GGA GAC GCG TGC GGG 273

GCA GTG TTT TTG GTA GGT CAA ATG TTC ACC TAT AGC CCT 312

CGC CGG CAT AAT GTT GTG CAG GAC TGC AAC TGT TCC ATT 351

TAC AGT GGC CAC ATC ACC GGC CAC CGG ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACA ACA GCT TTG GTG ATG 429

GCC CAG TTG TTA CGG ATT CCC CAG GTG GTC ATT GAC ATC 468

ATT GCC GGG GCC CAC TGG GGG GTC TTG TTC GCC GCC GCA 507

TAC TAC GCG TCG GCG GCT AAC TGG GCC AAG GTT GTG CTG 546

GTC CTG TTT CTG TTT GCG GGG GTC GAT GCC 576

(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

CTT ACC TAC GGC AAC TCC AGT GGG CTA TAC CAT CTC ACA 39

AAT GAT TGC CCC AAC TCC AGC ATC GTG CTG GAG GCG GAT 78

GCT ATG ATC TTG CAT TTG CCT GGA TGC TTG CCT TGT GTG 117

AGG GTC GAT GAT CGG TCC ACC TGT TGG CAT GCT GTG ACC 156

CCC ACC CTG GCC ATA CCA AAT GCT TCC ACG CCC GCA ACG 195

GGA TTC CGC AGG CAT GTG GAT CTT CTT GCG GGC GCC GCA 234

GTG GTT TGC TCA TCC CTG TAC ATC GGG GAC CTG TGT GGC 273

TCT CTC TTT TTG GCG GGA CAA CTA TTC ACC TTT CAG CCC 312

CGC CGT CAT TGG ACT GTG CAA GAC TGC AAC TGC TCC ATC 351

TAT ACA GGC CAC GTC ACC GGC CAC AGG ATG GCT TGG GAC 390

ATG ATG ATG AAC TGG TCA CCC ACA ACC ACT CTG GTC CTA 429

TCT AGC ATC TTG AGG GTA CCT GAG ATT TGT GCG AGT GTG 468

ATA TTT GGT GGC CAT TGG GGG ATA CTA CTA GCC GTT GCC 507

TAC TTT GGC ATG GCT GGC AAC TGG CTA AAA GTT CTG GCT 546

GTT CTG TTC CTA TTT GCA GGG GTT GAA GCA 576

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: DK7 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

Tyr


f2l ti 


Val 


Arg 


Asn 


C A V 


i nr 


ni vr 
oiy 




ryr 


ni a 


v.ai 


X A1X 




A en 










c 










i o 
i \j 










X -J 


Cys 


Pro 


Asn 


Ser 


Ser 


lie 


vai. 


Tyr 




Aia 


A "I a 
Ala 


nop 


r\l a 


Tip 

lie 


JjcU 
















25 










30 


xllo 


J. Ill 


riu 


vjiy 




V=a 1 
V ax 


riO 


cys 


Val 


Arg 


f^l IV 




Ren 


v ax 


Car 
OCX 




















40 










45 


Arg 


cys 


irp 


Va 1 
Val 


ai a 




i nr 


Pro 


i nr 


Val 


Al 
riX a 


1 Hi 


Arg 


nop 


w x y 










3 u 




















0 


T ^ r o 

Jjys 


JjcU 


riu 


1 liXT 


ai a 

nla 


uin 


Leu 


Arg 


nig 


nis 


Tip 

lie 




UC Li. 


T .01 1 


Val 
v ax 










O ZD 










70 










75 




ser 


ax a 


i nr 


Leu 


Cys 


Ser 


Ala 

Ax a 


Leu 


Tyr 


Val 


oiy 


ASp 


Leu 


cys 










a n 
o u 










85 










on 


taiy 


Ser 


vai 


riie 


T a 1 1 

Lieu 


vai 


c»iy 


bin 


Lieu 


Phe 


Thr 


rne 


Ser 


Pro 


Arg 




















100 










105 


Arg His Trp Thr Thr Gln Gly Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 53: 

20 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53 



Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
                20                  25                  30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser
                35                  40                  45

Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly
                50                  55                  60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
                65                  70                  75



Gly 


Ser 


Ala 


Thr 


Leu 


Cvs 


Ser 


Ala 


JUCZ Li 




V d _L 


Gly Asp 


Lieu 


Cys 


Gly 








80 










85 










9 n 
j \j 


Ser 


Val 


Phe 


Leu 


Val 


Glv 


Gin 


UC 14. 


Phe 


Thr 


Phe 


Ser 


Pro 


Arg 




His 






95 










100 










1 AC 
1UD 


Arg 


Trp 


Thr 


Thr 


Gin 


Asp 


Cys" 


Asn 


Cys 


Ser 


He 


Tyr 


r*i O 


vj±y 


His 


He 






110 










115 








120 


Thr 


Gly 


His 
125 


Arg 


Met 


Ala 


Trp 


Asp 
130 


Met 


Met 


Met 


Asn 


Trp 
135 


Ser 


Pro 


Thr 


Ala 


Ala 


Leu 


Val 


Met 


Ala 


Gin 


Leu 


Leu 


Arg 


He 


Pro 


Gin 


Ala 






140 










145 








150 


lie 


Leu 


Asp 


Met 


He 


Ala 


Gly 


Ala 


His 


Trp 


Gly 


Val 


Leu 


Ala 


Gly 






155 










160 






165 


He 


Ala 


Tyr 


Phe 


Ser 


Met 


Val 


Gly Asn 


Trp 


Ala 


Lys 


Val 


Val 


Val 






170 










175 






180 


Val 


Leu 


Leu 


Leu 


Phe 


Thr 


Gly 


Val 


Asp 


Ala 














185 








190 











(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR1 

20 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 



25 



His 


Gin 


Val 


Arg 


Asn 


Ser 


Thr 


Gly 


Leu 


Tyr 


His 


Val 


Thr 


Asn 


Asp 


Cys 








5 










10 










15 


Pro 


Asn 


Ser 


Ser 


He 


Val 


Tyr 


Glu 


Ala 


Ala 


Asp 


Ala 


He 


Leu 


His 


Ala 






20 










25 








30 


Pro 


Gly 


Cys 


Val 


Pro 


Cys 


Val 


Arg 


Glu 


Gly 


Asn 


Ala 


Ser 


Arg 








35 










40 








45 


Cys 


Trp 


Val 


Ala 


Val 


Thr 


Pro 


Thr 


Val 


Ala 


Thr 


Arg 


Asp 


Gly 










50 










55 






60 


Lys 


Leu 


Pro 


Thr 


Thr 


Gin 


Leu 


Arg 


Arg 


His 


He 


Asp 


Leu 


Leu 


Val 


Gly 








65 










70 








75 


Ser 


Ala 


Thr 


Leu 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 


Gly 


Asp 


Leu 


Cys 


Gly 








80 










85 








90 


Ser 


Val 


Phe 


Leu 


Val 


Gly 


Gin 


Leu 


Phe 


Thr 


Phe 


Ser 


Pro 


Arg 




His 






95 










100 










105 


Arg 


Trp 


Thr 


Thr 


Gin 


Asp 


Cys 


Asn 


Cys 


Ser 


He 


Tyr 


Pro 


Gly 


His 


He 


Thr 




110 










115 








120 


Gly 


His 


Arg 


Met 


Ala 


Trp 


Asp 


Met 


Met 


Met 


Asn 


Trp 


Ser 




Thr 




125 










130 










135 


Pro 


Thr 


Ala 


Leu 


Val 


Met 


Ala 


Gin 


Leu 


Leu 


Arg 


He 


Pro. 










140 










145 








150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Val Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:

His Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
                20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr Ser
                35                  40                  45

Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly
                50                  55                  60

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
                65                  70                  75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

His His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids



SUBSTITUTE SHEET (RULE 26) 



WO 96/05315



(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S14 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr Ser
                35                  40                  45

Arg Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly
                50                  55                  60

Lys Leu Pro Ala Thr Gln Leu Arg Arg Tyr Ile Asp Leu Leu Val
                65                  70                  75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg Leu Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
                185                 190











(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S18 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57 






PCT/US95/10398



Tyr 


l?XIl 


val. 




noli 


Ser 


TV) Y" 

1 XIX 


Glv 


Leu 


Tvr 


His 


Val 


Thr 


Asn 


Asp 








C 
O 










1 0 

X w 










15 


Cys 


PrO 


Asn 


O 0 v* 

OCX 


OCX 


Tip 

X XC 


val 


1 J J: 


OX LI 


Thr 

X ill 


Ala 


Asp 


Thr 


He 


Leu 




























30 


His 


OCX 


Prn 


vjx y 


5 


Val 


Pro 


Cys 


Val 


Arcr 
nx ^ 

40 


Glu 


Gly 


Asn 


Ala 


Ser 
45 




(^i;e 

jr © 


Trn 

irp 


Val 


Prn 


Val 

v a x 


Ala 


Pro 


Thr 


Val 


Ala 


Thr 


Arc? 


Asp 


Gly 
























60 


Lys 


Leu 


Pro 


Ala 


Thr 


Gin 


Leu 


Arg 


Arg 


His 


He 


Asp 


Leu 


Leu 


Val 


















70 










75 


Gly 


Ser 


Ala 


Thr 


Leu 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 


Gly 


Asp 


Leu 


Cys 








0 U 








fit; 










90 


r* 1 ■xr 

vixy 


Ser 


Vdl 


riic 


Leu 


VO.X 


C A 

OCX 


VJlli 


XJC Li 


Phe 


Thr 


Tie 

X X c 


Cp r 

t^w X 


Pro 


«x y 








■ 0 c 










100 










105 

X. V/ J 


Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Ile Ala Gln Leu Leu Arg Val Pro
                140                 145                 150

Gln Ala Val Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Ala Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Leu Val Leu Leu Leu Phe Ala Gly Val Asp Ala
                185                 190









(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala Ile Leu
                20                  25                  30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Asp Gly Ala Pro
                35                  40                  45

Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly
                50                  55                  60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
                65                  70                  75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                95                  100                 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Leu Leu Leu Phe Ser Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US11 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:






Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser
                 35                  40                  45

Arg Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly
                 50                  55                  60

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
                 65                  70                  75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Trp Thr Thr Gln Gly Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: D1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asp Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Gly
                 50                  55                  60

Asn Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Leu Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190









(2) INFORMATION FOR SEQ ID NO: 61: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: D3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr Gln Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asp Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp 








Q ^ T" 


no 11 


wC X 




He 


Val 


Tyr 


Glu 


Ala 


Val 


Asp 


Val 


He 


Met 








20 










25 










30 


ClJLo 


X XIX 


IT X V*> 


O J. jr 


^* y cs 


Val 

vax 


Pro Cys 


Val 

V ui. 


**xy 


Glu 


Asn 


Asn 


His 


Ser 










35 










40 










45 


/AX y 




iirp 




Ala 

^^X G. 


T 1^11 

- 1JC u 


i nr 


Pro 


Thr 




Ala 


Ala 


Arg 


Asn 


Ala 






50 










55 










60 


C a 

OCX 


Tip 
lie 


.fx O 


1 11X 




TVrr 

1 41X 


lie 


Arg 




70 


Val 
v ax 






T .f^u 


Val 

75 




Ala 


>\ia 


x-i-L a 


crie 


uys 


Ser 


Ala 


1*1 G L. 


lyx 


v ax * 




Asp 


Leu 


w y o 


















85 










90 


oiy 


Ser 


Val 


true 


Leu 


Val 


Ser 


Gin 


Leu 


Phe 


Thr 




Ser 


Pro 


nX y 








95 










100 










105 


Arg 


HIS 




inr 


Aia 
110 


bin 


Asp 


Cys 


Asn 


Cys 
115 


Ser 


lie 


Tyr 


Pro 


vyly 
120 


His 


Val 


Ser 


Gly 


His 
125 


Arg 


Met 


Ala 


Trp 


Asp 
130 


Met 


Met 


Met 


Asn 


Trp 
135 


Ser 


Pro 


Thr 


Thr 


Ala 
140 


Leu 


Val 


Leu 


Ser 


Gin 
145 


Leu 


Leu 


Arg 


He 


Pro 
150 


Gin 


Ala 


Val 


Val 


Asp 


Met 


Val 


Ala 


Gly 


Ala 


His 


Trp 


Gly Val 


Leu 










155 










160 










165 


Ala 


Gly 


Leu 


Ala 


Tyr 


Tyr 


Ser 


Met 


Ala 


Gly Asn 


Trp 


Ala 


Lys 


Val 










170 










175 










180 


Leu 


He 


Val 


Leu 


Leu 
185 


Leu 


Phe 


Ala 


Gly 


Val 
190 


Asp 


Gly 









15 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 






Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Val Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Val
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Leu Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190











(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

His Glu Val His Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Ile Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Leu Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: HK5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 






Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Leu Ser Ile Val Tyr Glu Thr Thr Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Ala Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190













(2) INFORMATION FOR SEQ ID NO: 66: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 



WO 96/05315 



PCT/US95/10398 




(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66: 



Tvr 


Glu 


Val 




fieri 
noil 

C 


veil 


Cor 




X-Le 


Tyr 


HIS 


Val 


Thr 


Asn 


Cvs 


Ser 


Asn 


Ser 


Cpy* 


11C 


Val 


iyr 


Glu 


10 
Thr 


Ala 


Asp 


Met 


He 


His 








2 Q 










25 








Thr 


Pro 


Glv 








Lys 


Val 


Arg 


Glu 


Asn 


Asn 


Ser 










■a c 
3 5 










40 










Ara 


Cys 










1 lii 


Pro 


Thr 


Leu 


Ala 


Ala 


Arg 


Asn 




Val 






o u 










55 








Ser 


Pro 


Thr 


Thr 


Thr 


He 


Arg 


Arg 


His 


Val 


Asp 


Leu 


Leu 


Gly 








65 










70 








Ala 


Ala 


Ala 


Phe 
80 


Cys 


Ser 


Ala 


Met 


Tyr 


Val 


Gly 


Asp 


Leu 


Gly 


Ser 


Val 


Phe 


Leu 


Val 


Ser 


Gin 


Leu 


85 
Phe 


Thr 


Phe 


Ser 


Pro 




His 






95 










100 










Arg 


Glu 


Thr 


Val 


Gin 


Asp 


Cys 


Asn 


Cys 


Ser 


He 


Tyr 


Pro 


His 


Val 






110 










115 








Ser 


Gly 


His 


Arg 


Met 


Ala 


Trp 


Asp 


Met 


Met 


Met 


Asn 










125 










130 










Ser 


Pro 


Thr 


Thr 


Ala 


Leu 


Val 


Val 


Ser 


Gin 


Leu 


Leu 


Arg 


He 


Gin 


Ala 






140 










145 








He 


Val 


Asp 


Met 


Val 


Ala 


Gly Ala 


His 


Trp 


Gly 


Val 


Ala 


Gly 






155 










160 






Leu 


Ala 


Tyr 


Tyr 


Ser 


Met 


Val 


Gly Asn 


Trp 


Ala 


Lys 




He 






170 










175 






Leu 


Val 


Met 


Leu 


Leu 


Phe 


Ala 


Gly Val 


Asp 


Gly 














185 










190 







15 
Met 

30 
Ser 

4 5 
Val 

60 
Val 

75 
Cys 

90 
Arg 
105 
Gly 
120 
Trp 
135 
Pro 
150 
Leu 
165 
Val 
180 



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: IND5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30



His 


Thr 


Pro 


Glv 


w y o 

3 5 


Val 


Or o 




V cix 


Aror 
40 


Glu 

V3X \JL 


Gly 


Asn 


r 

OCX 


Q^r 

OCX 

45 


Arg 


Cys 


X X p 


Val 

V G& X. 


Ala 
50 


XJC U 


Thr 

X XXX - 


Pro 
c x u 


Thr 

X XXX 


XJC u 

55 


Ala 

X"\X CI 


Ala 


Arg 


A en 


Al a 

/lX CI 

60 


Ser 


Val 


C o Y~ 

OCX 


Thr 


Thr 

1 XXX 

65 


Thr 
X XXX 


Tip 


nxy 


XIX o 


nx o 
70 


vax 


Asp 


Leu 


T .A 1 1 
XJCU 


V c&X 

75 


Gly Ala 


Al a 

rVX CI 


nX ct 


Php 

XT 1X6 


v~ jr o 


Cor 

OCX 


riX el 


1*1 C L. 




vax 


Gly Asp 


T .A 1 1 

bell 












RO 










65 












Gly 


Ser 


V clX 


It He 


Leu 
95 


vax 


C A V* 

OCX 


bin 


Leu 


Phe 
100 


Thr 


Phe 


Ser 


Pro 


Arg 
105 


Arg 


His 


VjX u 


i nr 


vax 
110 


LjXIx 


Asp 


Cys 


Asn 


Cys 
115 


Ser 


He 


Tyr 


Pro 


taj.y 
120 


His 


Val 


Ser 


Gly 


His 
125 


Arg 


Met 


Ala 


Trp 


Asp 
130 


Met 


Met 


Met 


Asn 


Trp 
135 


Ser 


Pro 


Thr 


Ala 


Ala 
140 


Leu 


Val 


Val 


Ser 


Gin 
145 


Leu 


Leu 


Arg 


He 


Pro 
150 


Gin 


Ala 


Vai 


Val 


Asp 
155 


Met 


Val 


Ala 


Gly 


Ala 
160 


His 


Trp 


Gly 


He 


Leu 
165 


Ala 


Gly 


Leu 


Ala 


Tyr 


Tyr 


Ser 


Met 


Val 


Gly Asn 


Trp 


Ala 


Lys 


Val 










170 










175 










180 


Leu 


He 


Val 


Met 


Leu 
185 


Leu 


Phe 


Ala 


Gly 


Val 
190 


Asp 


Gly 









(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: IND8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 






Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Phe Ser
                 35                  40                  45

Ser Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: P10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

r Asn Asp 
15 

t lie Met 
30 

i Ser Ser 
45 

j Asn Ser 
60 

i Leu Val 

, 03 70 75 

Cys 
90 
Arg 
105 
Gly 
120 
Trp 

30 125 130 135 

Pro 
150 
Leu 
165 
Val 
180 

Leu lie Val M<=1- T,bh Tan DVia 711= m.. 17.1 » ^. -i 

35 



Tyr 


Glu 


Val 


Arg 


Asn 
5 


Val 


Ser 


Gly 


Val 


Tyr 


His 


Val 


Cys 


Ser 


Asn 


Ser 


Ser 


He 


Val 


Tyr 


Glu 


10 
Ala 


Ala 


Asp 


His 








20 










25 




Thr 


Pro 


Gly 


Cys 


Val 


Pro 


Cys 


Val 


Arg 


Glu 


Asn 


Arg 








35 










40 






Cys 


Trp 


Val 


Ala 


Leu 


Thr 


Pro 


Thr 


Leu 


Ala 


Ala 


Ser 


Val 






50 










55 






Pro 


Thr 


Thr 


Ala 


He 


Arg 


Arg 


His 


Val 


Asp 










65 










70 




Gly Ala 


Ala 


Ala 


Phe 


Cys 


Ser 


Ala 


Met 


Tyr 


Val 


Gly 


Gly 








80 










85 




Ser 


Val 


Leu 


Leu 


Val 


Ser 


Gin 


Leu 


Phe 


Thr 


Phe 


Arg 


His 






95 










100 






Trp 


Thr 


Val 


Gin 


Asp 


Cys 


Asn 


Cys 


Ser 


He 


His 


Val 






110 










115 






Ser 


Gly 


His 


Arg 


Met 


Ala 


Trp 


Asp 


Met 


Met 


Ser 








125 








130 






Pro 


Thr 


Ala 


Ala 


Leu 


Val 


Val 


Ser 


Gin 


Leu 


Leu 


Gin 


Ala 






140 










145 






He 


Leu 


Asp 


Val 


Val 


Ala 


Gly Ala 


His 


Trp 


Ala 


Gly 






155 










160 




Leu 


Ala 


Tyr 


Tyr 


Ser 


Met 


Val 


Gly 


Asn 


Trp 


Leu 


He 


Val 




170 










175 




Met 


Leu 


Leu 


Phe 


Ala 


Gly Val 


Asp 


Gly 










185 










190 
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(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S9 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 



Tyr Glu Val Arg Asn Val Ser Gly Ala Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Val Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Gln Glu Gly Asn Ser Ser
                 35                  40                  45

Gln Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Thr Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Val Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Ile Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asn Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190









(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: S45 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 






Tyr 


Glu 


Val 


Arg 


Asn 


Val 


Ser 


Gly 


Ala Tyr 


His 


Val 


Thr 


Asn 


Asp 


Cys 








5 








10 










15 


Ser 


Asn 


Ser 


Ser 


He 


Val 


Tvr 


Glu Ala 


Val 


Asp 


Val 


He 


Leu 


His 


Thr 






20 








25 








30 


Pro 


Gly 


Cys 


Val 


Pro 


Cys 


Val Aro 




Asn 


Asn 


Ser 


Ser 










35 








40 










45 


Arg 


Cys 


Trp 


Val 


Ala 


Leu 


Thr 


Pro 


Thr T.^*ii 

■X XXX. Jwlw U 




Ala 


Arg 


Asn 


Ser 




Val 






50 








55 








60 


Ser 


Pro 


Thr 


Thr 


Thr 


He 


Arci 


Arg His 


Val 


Asp 


Leu 


Leu 


Val 










65 








70 








75 


.Gly Ala 


Ala 


Ala 


Phe 


Cys 


Ser 


Ala 


Met Tyr 


Val 


Gly 


Asp 


Leu 


Cys 


Gly 








80 








85 








90 


Ser 


Val 


Phe 


Leu 


Val 


Ser 


Gin 


Leu Phe 


Thr 


Phe 


Ser 


Pro 


Arg 


Arg 


His 


Glu 




95 








100 










105 


Thr 


Val 


Gin 


Asp 


Cys 


Asn Cys 


Ser 


He 


Tyr 


Pro 


Gly 


His 


Val 


Thr 




110 








115 








120 


Gly 


His 


Arg 


Met 


Ala 


Trp Asp 


Met 


Met 


Met 


Asn 


Trp 


Ser 




Thr 




125 








130 










135 


Pro 


Ala 


Ala 


Leu 


Val 


Val 


Ser Gin 


Leu 


Leu 


Arg 


He 


Pro 


Gin 


Ala 






140 








145 




• 




150 


Val 


Val 


Asp 


Met 


Val 


Ala 


Gly Ala 


His 


Trp 


Gly 


Val 


Leu 


Ala 


Gly 






155 








160 






165 


Leu 


Ala 


Tyr 


Tyr 


Ser 


Met 


Val Gly Asn 


Trp 


Ala 


Lys 


Val 


Leu 


He 


Val 




170 








175 






180 


Met 


Leu 


Leu 


Phe 


Ala 


Gly Val 


Asp 


Gly 














185 








190 









(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: SA10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 



5 

Cys. Ser Asn Ser Ser 

20 

His Thr Pro Gly Cys 

35 

Arg Cys Trp Val Ala 
35 50 



Gly 


Met 


Tyr 
10 


His 


Val 


Thr 


Asn 


Asp 
15 


Tyr 


Glu 


Ala 
25 


Ala 


Asp 


Met 


He 


Met 
30 


Cys 


Val 


Arg 
40 


Glu 


Asn 


Asn 


Ser 


Ser 
45 


Pro 


Thr 


Leu 


Ala 


Ala 


Arg 


Asn 


Ser 






55 








60 
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o 



Ser 


Val 


Pro 


1 Xli 


i nr 


Thr 


He 


Arg . 


Arg 


HIS 


vai 


Asp 


Leu 


Leu 


Vdl 




















70 










/ O 


Gly Ala 


/VJ.d 


Ala 


rTie 


Cys 


Ser 


ax a 


Met 


Tyr 


vai 


uiy 


Asp 


Leu 


Cys 










ft n 










85 










q n 


Gly 


Ser 


Val 


DVt a 
xrlle 


Leu 


Val 


Ser 




Leu 


Phe 


Thr 


TDK a, 
rile 


Ser 


Pro 


Arg 




















100 












Arg 


Tyr 


GlU 


m v 

Thr 


Val 


Gin Asp 


Cys 


Asn 


Cys 


Ser 


He 


Tyr 


Pro 


Gly 










110 










115 










120 


Arg 


Val 


Thr 


Gly 


His 


Arg 


Met 


Ala 


Trp 


Asp 


Met 


Met 


Met 


Asn 


Trp 










125 










130 










135 


Ser 


Pro 


Thr 


Thr 


Ala 


Leu 


Val 


Val 


Ser 


Gin 


Leu 


Leu 


Arg 


He 


Pro 










140 










145 










150 


Gin 


Ala 


He 


Val 


Asp 


Met 


Val 


Ala 


Gly Ala 


His 


Trp 


Gly 


Val 


Leu 










155 










160 










165 


Ala 


Gly 


Leu 


Ala 


Tyr 


Tyr 


Ser 


Met 


Val 


Gly Asn 


Trp 


Ala 


Lys 


Val 










170 










175 










180 


Leu 


He 


Val 


Met 


Leu 


Leu 


Phe 


Ala 


Gly Val 


Asp 


Gly 
















185 










190 













(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: SW2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Ala Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Thr
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Val Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T3 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr Tyr Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Ser Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Lys Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly
                185                 190
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20 
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(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 






Tyr 


Glu 


Val 


Arg 


Asn 


Val 


Ser 


Gly 


Met 


Tyr 


His 




1 ixi 


7V e? t"i 
noil 












, 5 










10 










15 


Cys 


Ser 


Asn 


Ser 


Ser 


lie 


Val 


Phe 


Glu 


Ala 


Ala 


Asp 


Leu 


Tl ^ 

lie 


PlCL 










20 










25 










3 0 


His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Thr
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Leu Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Thr Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Ala Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM : homosapiens 
(C) INDIVIDUAL ISOLATE: US 6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:



Tyr 


Glu 


Val 


Arcr 


Asn 


Val 




w x y 


Met- 
ric U 


iyr 


ill S 


vai 


i nr 


Asn 


Asp 


Cys 








5 










10 










1 o 


Ser 


Asn 


Ser 


Ser 


lie 


Val 

V CL X 


Tvrr 


Vjl Li 


nl a 


Ala 
Aid 


Asp 


Met 


lie 


Met 


His 


Thr 






20 


















*5 n 
o U 


Pro 


Glv 


35 


Val 

V CI X 


2r i vj 




Veil 


Arg 

4 n 
u 


blU 


Asn 


Asn 


Ser 


Ser 

A C 

45 


Arg 


Cys 


Trp 


Val 

V SIX. 


Ala 




Thr 
X Xlx 




inr 


Leu 


Ala 


Ala 


Arg 


Asn 


Ala 


Ser 


Val 






50 










55 








bO 


pro 


X 1 IX 


Thr 


1 ilx 


116 


Arg 


Arg 


His 


Val 


Asp 


Leu 


Leu 


Val 






Ala 




65 










70 








«"7 r- 

75 


Gly Ala 


Thr 




Lys 


C A V* 


Ala 


Met 


Tyr 


Val 


Gly 


Asp 


Leu 


Cys 


Gly 








80 










65 








90 


Ser 


Val 


rue 




Tip 

11c 


O A V 


L>m 


Leu 


Phe 


Thr 


Phe 


Ser 


Pro 


Arg 


Gin 


His 


Glu 




95 










100 










105 


Thr 


Val 


Gin 


Asp 


Cys 


Asn 


Cys 


Ser 


He 


Tyr 


Pro 


Gly 


His 


Val 






110 










115 








120 


Ser 


Gly 


His 


Arg 


Met 


Ala 


Trp 


Asp 


Met 


Met 


Met 


Asn 


Trp 


Ser 




Thr 




125 










130 










135 


Pro 


Ala 


Ala 


Leu 


Val 


Val 


Ser 


Gin 


Leu 


Leu 


Arg 


He 


Pro 


Gin 


Ala 


Val 




140 










145 








150 


Met 


Asp 


Met 


Val 


Ala 


Gly Ala 


His 


Trp 


Gly 


Val 


Leu 


Ala 


Gly 






155 










160 






165 


Leu 


Ala 


Tyr 


Tyr 


Ser 


Met 


Val 


Gly Asn 


Trp 


Ala 


Lys 


Val 


Leu 


He 


Val 




170 










175 






180 


Leu 


Leu 


Leu 


Phe 


Ala 


Gly Val 


Asp 


Gly 














185 










190 









(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T2 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Ala Gln Val Arg Asn Thr Ser Arg Gly Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Glu Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu

20 25 30

His Val Pro Gly Cys Ile Pro Cys Glu Arg Leu Gly Asn Thr Ser

35 40 45

Arg Cys Trp Ile Pro Val Thr Pro Asn Val Ala Val Arg Gln Pro

50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val

65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Arg

95 100 105

Arg His Trp Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro

140 145 150

Glu Val Ile Ile Asp Ile Ile Gly Gly Ala His Trp Gly Val Met

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala

185 190









(2) INFORMATION FOR SEQ ID NO: 78 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homo sapiens 
(C) INDIVIDUAL ISOLATE: T4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78: 



Ala Gln Val Lys Asn Thr Thr Asn Ser Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu

20 25 30

His Val Pro Gly Cys Val Pro Cys Glu Lys Thr Gly Asn Thr Ser

35 40 45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Arg Gln Pro

50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val

65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Gln

95 100 105

His His Trp Phe Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro

140 145 150

Glu Val Ile Leu Asp Ile Val Ser Gly Ala His Trp Gly Val Met

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Ala Glu Val Lys Asn Thr Ser Thr Ser Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu

20 25 30

His Val Pro Gly Cys Val Pro Cys Glu Arg Val Gly Asn Ala Ser

35 40 45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro

50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val

65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Ile Ser Pro Gln

95 100 105

His His Trp Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro

140 145 150

Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His Trp Gly Val Met

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Val Val Ile Leu Leu Leu Thr Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 80: 
(i) SEQUENCE CHARACTERISTICS:



SUBSTITUTE SHEET (RULE 26) 



WO 96/05315 



PCT/US95/10398 






(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: US10 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

Val Gln Val Lys Asn Thr Ser Thr Ser Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Glu Ala Ala Val Leu

20 25 30

His Val Pro Gly Cys Val Pro Cys Glu Lys Val Gly Asn Thr Ser

35 40 45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro

50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val

65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Phe Cys

80 85 90

Gly Gly Met Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Arg

95 100 105

His His Ser Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Thr Leu Ile Leu Ala Tyr Val Met Arg Val Pro

140 145 150

Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His Trp Gly Val Leu

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala

185 190









(2) INFORMATION FOR SEQ ID NO: 81: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:


Val Glu Val Arg Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp

5 10 15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu

35 40 45

Arg Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg

50 55 60

Gly Ala Leu Thr His Asn Leu Arg Thr His Val Asp Val Ile Val

65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Val Met Ile Val Ser Gln Ala Leu Ile Ile Ser Pro Glu

95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp

125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro

140 145 150

Glu Leu Ala Leu Gln Val Val Phe Gly Gly His Trp Gly Val Val

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 



25 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK11 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

lu Val Arg Asn Thr St 
5 



30 C V S Ser f ^ sn Asn Ser 

20 

His Leu Pro Gly Cys 

35 

His Cys Trp He Gin 

50 

Gly Ala Leu Thr His 

65 

35 



Ser 


Ser 


Tyr 
10 


Tyr 


Ala 


Thr 


Asn 


Asp 
15 


Trp 


Gin 


Leu 
25 


Thr 


Asn 


Ala 


Val 


Leu 
30 


Cys 


Glu 


Asn 


Asp 


Asn 


Gly Thr 


Leu 






40 










45 


Pro 


Asn 


Val 
55 


Ala 


Val 


Lys 


His 


Arg 
60 


Arg 


Ala 


His 


He 


Asp 


Met 


He 


Val 






70 








75 






Met 


Ala 


AT a 


± XIX 


Val 

vax 




OCX 


XXX C* 


T iPMl 
JJC la 


Tvr 
i y x 


Val 


Glv 


Asp 


Val 


Cys 










ft n 


















90 


Gly Ala 


v dl. 




Tip 
X X C 


VCLl 


C >- 

OCX. 


Gl n 

VJJL 1 1 


Ala 


Phe 


He 


Val 


Ser 


Pro 


Glu 




















100 










105 

His His His Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp

125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro

140 145 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 83: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

Val Glu Val Arg Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu

35 40 45

His Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg

50 55 60

Gly Ala Leu Thr His Asn Leu Arg Ala His Val Asp Met Ile Val

65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Met Ile Val Ser Gln Ala Phe Ile Ile Ser Pro Glu

95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly

110 115 120

Arg Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp

125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro

140 145 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

Val Glu Val Arg Asn Thr Ser Phe Ser Tyr [residues 11-15 illegible]

5 10 15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu

35 40 45

Arg Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg

50 55 60

Gly Ala Leu Thr His Asn Leu Arg Thr His Val Asp Val Ile Val

65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Val Met Ile Ala Ser Gln Ala Phe Ile Ile Ser Pro Glu

95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp

125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro

140 145 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val [residues 191-192 illegible]

185 190



(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 






(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S83 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Val Glu Val Lys Asp Thr Gly Asp Ser Tyr Met Pro Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Trp Gln Leu Glu [residues 27-28 illegible] Val Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Glu Arg Thr Ala Asn Val Ser

35 40 45

Arg Cys Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro

50 55 60

Gly Ala Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val

65 70 75

Met Ser Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Leu Met Leu Ala Ala Gln Val Val Val Val Ser Pro Gln

95 100 105

His His Thr Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Arg Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Thr Met Leu Leu Ala Tyr Leu Val Arg Ile Pro

140 145 150

Glu Val Ile Leu Asp Ile Val Thr Gly Gly His Trp Gly Val Met

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ser Trp Ala Lys Val

170 175 180

Ile Val Ile Leu Leu Leu Thr Ala Gly Val Glu Ala

185 190













(2) INFORMATION FOR SEQ ID NO: 86: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK12 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:






Leu 


Glu 


Trp 


Arg 


Asn 


Val 


Ser 


Cys 


Ser 


Asn 


Ser 


5 

Ser 


lie 


Val 










20 






His 


Thr 


Pro 


Gly 


Cvs 


Val 












35 






Thr 


Cys 


Trp 


Thr 


Ser 


Val 


Thr 










— ? \j 






Gly 


Ala 


Thr 


Thr 


Ala 


Ser 


He 

± X c 










.65 






Gly Ala 


Ala 


Thr 


Met 


Cys 


Cat* 

OCX 










80 






Gly Ala 


Val 


Phe 


Leu 


Val 


Glv 
wjl y 










95 




Arg 


His 


Gin 


Thr 


Val 


Gin 


l nr 










110 






His 


Leu 


Ser 


Gly 


His 


Arg 


Met 










125 






Ser 


Pro 


Ala 


Val 


Gly Met 


Val 










140 






Gin 


Thr 


Leu 


Phe 


Asp 


He 


He 










155 






Ala 


Gly 


Leu 


Ala 


Tyr 


Tyr 


Ser 










170 






Ala 


He 


He 


Met 


Val 


Met 


Phe 










185 










Gly 


Leu 


Tvr 


Val 

V G& X 


Leu 


Thr 


Asn 


Asp 
15 


Tvr 


Glu 


Ala 




Asp 


Val 


He 


Leu 


Cys 














30 


Val 


Gl n 

will 




Gly Asn 


Thr 


Ser 






U 










45 


Pro 


Thr 

X 11X 


val 


/I j. a 


Val 


Arg 


Tyr 


Val 














60 


Arg 




rix s 


vai 


Asp 


Leu 


Leu 


Val 


Ala 




70 










75 


Leu 


Tyr 


vai 


Gly Asp Val 


Cys 


Gin 




85 










90 


Ala 


Phe 


Thr 


Phe 


Arg 


Pro 


Arg 


Cys 




100 










105 


Asn 


Cys 


Ser 


Leu 


Tyr 


Pro 


Gly 


Ala 




115 










120 


Trp 


Asp 


Met 


Met 


Met 


Asn 


Trp 


Val 




130 










135 


Ala 


His 


Val 


Leu 


Arg 


Leu 


Pro 


Ala 




145 








150 


Gly Ala 


His 


Trp 


Gly 


He 


Met 


Met 


Gin 


160 








165 


Gly Asn 


Trp 


Ala 


Lys 


Val 






175 








180 


Ser 


Gly 


Val 
190 


Asp 


Ala 







(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Leu Glu Trp Arg Asn Val Ser Gly Leu Tyr Val Leu Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Thr Cys Trp Thr Ser Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

[residue 106 illegible] His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro

140 145 150

Gln Thr Leu Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala

185 190









(2) INFORMATION FOR SEQ ID NO: 88:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro

140 145 150

Gln Thr Val Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S52 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 



Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Met Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Ile Leu Arg Leu Pro

140 145 150

Gln Thr Leu Phe Asp Ile Leu Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Val Met Ile Met Phe Ser Gly Val Asp Ala

185 190











(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 



(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S54 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 



Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Ile Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Ile Leu Arg Leu Pro

140 145 150

Gln Thr Leu Phe Asp Ile Leu Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Ile Met Ile Met Phe Ser Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 91: 

25 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Thr Ser

35 40 45

Arg Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Ala His Pro

50 55 60

Gly Ala Pro Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val

65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Gly Ala Phe Leu Met Gly Gln Met Ile Thr Phe Arg Pro Arg

95 100 105

Arg His Trp Thr Thr Gln Glu Cys Asn Cys Ser Ile Tyr Thr Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Thr Leu Leu Leu Ala Gln Ile Met Arg Val Pro

140 145 150

Thr Ala Phe Leu Asp Met Val Ala Gly Gly His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Phe Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z1



(xi) 


SEQUENCE 


Val His 


Tyr 


Arg 


Asn 


Ala 


Cys Pro 


Asn 


Thr 


5 

Ser 


He 


His Leu 






20 




Pro 


Gly 


Cys 


Val 








35 




Arg Cys 


Trp 


Val 


Pro 


Leu 








50 




Asn Ala 


Pro 


Leu 


Glu 


Ser 


Gly Ala 






65 




Ala 


Thr 


Met 


Cys 








80 


Gly Gly 


Val : 


Phe 


Leu 


Val 








95 





SEQUENCE DESCRIPTION: SEQ ID NO: 92 



10 

al Tyr Glu Thr 
25 

ro Cys Val Arg 

30 _ _ _ 35 40 

ir Pro Thr Val 
55 

it Arg Arg His 
70 

ir Ala Phe Tyr 
85 

.y Gin Leu Phe 
100 



NO: 


92 : 






Val 


Thr 


Asn 


Asp 








15 


His 


His 


He 


Met 








30 


Glu 


Asn 


Thr 


Ser 








45 


Ala 


Pro 


Tyr 


Pro 








60 


Asp 


Leu 


Met 


Val 








75 


Gly 


Asp 


Leu 


Cys 








90 


Phe 


Arg 


Pro 


Arg 








105 
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5 



/u. y 


XJ -i e 

ill B 


ir P 


X XIX 


X 11X\ 

110 


WliJ 


nop 




noil 


115 


Cat* 


Tip 

lie 


Tyr 


Pro 


120 


His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ser Ala Leu Ile Met Ala Gln Ile Leu Arg Ile Pro
                140                 145                 150

Ser Ile Leu Gly Asp Leu Leu Thr Gly Gly His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Phe Phe Ser Met Gln Ser Asn Trp Ala Lys Val
                170                 175                 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Glu Gly
                185                 190









(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Glu His Gln Ile Leu
                20                  25                  30

His Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Val Ala Val Ser Tyr Ile
                50                  55                  60

Gly Ala Pro Leu Asp Ser Leu Arg Arg His Val Asp Leu Met Val
                65                  70                  75

Gly Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Gly Ala Phe Leu Val Gly Gln Met Phe Ser Phe Gln Pro Arg
                95                  100                 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Ala Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Thr Leu Leu Leu Ala Gln Val Met Arg Ile Pro
                140                 145                 150

Ser Thr Leu Val Asp Leu Leu Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Val Gly Leu Ala Tyr Phe Ser Met Gln Ala Asn Trp Ala Lys Val
                170                 175                 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190

(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Val Asn Tyr His Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Met Tyr Glu Ala Glu His His Ile Leu
                20                  25                  30

His Leu Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Gln Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile
                50                  55                  60

Gly Ala Pro Leu Glu Ser Ile Arg Arg His Val Asp Leu Met Val
                65                  70                  75

Gly Ala Ala Thr Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys
                80                  85                  90

Gly Gly Val Phe Leu Val Gly Gln Met Phe Ser Phe Gln Pro Arg
                95                  100                 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Ala Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Thr Leu Val Leu Ala Gln Val Met Arg Ile Pro
                140                 145                 150

Ser Thr Leu Val Asp Leu Leu Thr Gly Gly His Trp Gly Ile Leu
                155                 160                 165

Ile Gly Val Ala Tyr Phe Cys Met Gln Ala Asn Trp Ala Lys Val
                170                 175                 180

Ile Leu Val Leu Phe Leu Tyr Ala Gly Val Asp Ala
                185                 190











(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE:



SUBSTITUTE SHEET (RULE 26) 



WO 96/05315



(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK13 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:






Tyr 


Asn 


Tyr 


Arg 


Asn 


Ser 


Ser 


Gly 


Val 


Tyr 


His 


Val 


Thr 


R en 


Act*) 




5 










10 










15 


Cys 


Pro 


Asn 


Ser 


Ser 


He 


Val 


Tyr 


Glu 


Thr 


Asp 


Tyr 


His 


Tip 
lie 










20 










25 












His 


Leu 


Pro 


Gly 


Cys 


Val 


Pro 


Cys 


Val 


Arg 


Glu 


Gly 


Asn 


T ,\/c 










35 










40 










45 


Thr 


Cys 


Trp 


Val 


Ser 


Leu 


Thr 


Pro 


Thr 


Val 


Ala 


Ala 


Gin 


ni o 








50 










55 










60 


Asn 


Ala 


Pro 


Leu 


Glu 


Ser 


Leu 


Arg 


Arg 


His 


Val 


Asp 


Leu 


rlC L. 












65 










70 










/ -mi 


Gly 


Gly 


Ala 


Thr 


Leu 


Cys 


Ser 


Ala 


Leu 


Tyr 


He 


Gly 


Asp 


v d J- 








80 










85 










90 


Gly Gly Val Phe Leu Val Gly Gln Leu Phe Thr Phe Gln Pro Arg
                95                  100                 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Thr Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Thr Leu Val Leu Ala Gln Leu Met Arg Ile Pro
                140                 145                 150

Gly Ala Met Val Asp Leu Leu Ala Gly Gly His Trp Gly Ile Leu
                155                 160                 165

Val Gly Ile Ala Tyr Phe Ser Met Gln Ala Asn Trp Ala Lys Val
                170                 175                 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190
















(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Ser Leu Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asp Asn Val Ser
                35                  40                  45






Arg Cys 


Trp 


val 




lie 


Thr 


Pro 


Tnr 


Leu 


Ser 


Ala 


Pro 


Thr 


Phe 


Gly Ala 






50 




















60 


Val 


i nr 


ax a 


Pro 


Leu 


Arg 


Arg 


Ala 


Val 


Asp 


Tyr 


Leu 


Ala 


Gly Gly 


Ala 




65 










70 






75 


/ix a 


Leu. 




Ser 


Ala 


Leu 


Tyr 


Val 


Gly 


Asp 


Ala 


Cys 


Gly Ala 


val 




80 










85 








90 


-file 


Leu 


vai 


Gly Gin 


Met 


Phe 


Thr 


Tyr 


Arg 


Pro 


Arg 


Gin His 






95 










100 








105 


Thr 

111 


Thr 


V CiJ. 




Asp 


Cys 


Asn 


Cys 


Ser 


He 


Tyr 


Ser 


Gly 


                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Leu Met Ala Gln Met Leu Arg Ile Pro
                140                 145                 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
                170                 175                 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Gly
                185                 190









(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA4 

(xi) 



Val Pro 


Tyr 


Arg 


Cys Pro 


Asn 


Ser 


His Ala 


Pro 


Gly 


Lys Cys 


Trp 


Val 


Gly Ala 


Val 


Thr 


Gly Gly 


Ala 


Ala 


Gly Ala 


Val 


Phe 


Gin His 


Thr 


Thr 



SEQUENCE 


DESCRIPTION 


: SEQ ID 


NO: 


97 : 




Asn 
5 


Ala 


Ser 


Gly 


Val 


Tyr 


His 


Val 


Thr 


Asn 


Ser 


He 


Val 


Tyr 


Glu 


10 
Ala 


Asp 


Asn 


Leu 


He 


20 










25 








Cys 


Val 


Pro 


Cys 


Val 


Arg 


Gin 


Asp 


Asn 


Val 


35 










40 








Gin 


He 


Thr 


Pro 


Thr 


Leu 


Ser 


Ala 


Pro 


Asn 


50 










55 








Ala 


Pro 


Leu 


Arg 


Arg 


Ala 


Val 


Asp 


Tyr 


Leu 


65 










70 






Leu 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 


Gly 


Asp 


Ala 


60 










85 






Leu 


Val 


Gly 


Gin 


Met 


Phe 


Thr 


Tyr 


Arg 


Pro 


95 










100 






Val 


Gin 


Asp 


Cys 


Asn 


Cys 


Ser 


He 


Tyr 


Ser 



35 



15 
Leu 
30 
Ser 
45 
Leu 
60 
Ala 
75 
Cys 
90 
Arg 
105 

Gly 

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Leu Met Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
                170                 175                 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190







(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Lys Glu Gly Asn Val Ser
                35                  40                  45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu
                50                  55                  60

Gly Ala Val Thr Ala Pro Leu Arg Arg Val Val Asp Tyr Leu Ala
                65                  70                  75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
                80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
                95                  100                 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Val Leu Arg Ile Pro
                140                 145                 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Phe Ala Val Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
                170                 175                 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Gly
                185                 190










(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99 

Val Pro Tyr Arg Asn 
10 5 
Cys Pro Asn Ser Ser 

20 

His Ala Pro Gly Cys 

35 

Arg Cys Trp Val His 

50 

Gly Ala Val Thr Ala 

65 

Gly Gly Ala Ala Leu 

80 

Gly Ala Leu Phe Leu 

95 

Gin His Ala Thr Val 

110 

His lie Thr Gly His 

125 

Ser Pro Ala Thr Ala 

140 

Gin Val Val He Asp 

155 

Phe Ala Ala Ala Tyr 

170 

Val Leu Val Leu Phe 

185 " 190 



15 



20 



Ala 


Car 
OCX 




Val 


Tyr 


His 


Val 


Thr 


Asn 


Asp 


He 








10 










15 


Val 


Tyr 


Glu 


Ala 


Asp 


Asp 


Leu 


He 


Leu 


Val 








25 










30 


Pro 


Cys 


Val 


Arg 


Lys 


Asp 


Asn 


Val 


Ser 


He 


Thr 






40 










45 


Pro 


Thr 


Leu 
55 


Ser 


Ala 


Pro 


Ser 


Leu 
60 


Pro 


Leu 


Arg 


Arg 


Ala 


Val 


Asp 


Tyr 


Leu 


Ala 










70 








75 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 


Gly 


Asp 


Val 


Cys 


Val 








85 










90 


Gly 


Gin 


Met 


Phe 


Thr 


Tyr 


Arg 


Pro 


Arg 


Gin 








100 










105 


Asp 


Cys 


Asn 


Cys 
115 


Ser 


He 


Tyr 


Ser 


Gly 
120 


Arg 


Met 


Ala 


Trp 


Asp 
130 


Met 


Met 


Met 


Asn 


Trp 
135 


Leu 


Val 


Met 


Ala 


Gin 


Met 


Leu 


Arg 


He 


Pro 


He 








145 








150 


He 


Ala 


Gly Gly His 


Trp 


Gly 


Val 


Leu 


Phe 








160 










165 


Ala 


Ser 


Ala 


Ala 


Asn 


Trp 


Ala 


Lys 


Val 










175 








180 


Leu 


Phe 


Ala 


Gly 


Val 


Asp 


Ala 









(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asn Asn Val Ser
                35                  40                  45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu
                50                  55                  60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
                65                  70                  75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
                80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Ser Tyr Arg Pro Arg
                95                  100                 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
                170                 175                 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190











(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Gly Asn Val Ser
                35                  40                  45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu
                50                  55                  60
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Gly Ala 


Val 


Thr 


Ala 


Pro 


Leu 






Ala 


Val 


nop 


Tyr 


Leu 


/ix a 








65 










/ u 








75 


Gly Gly Ala 


Ala 


Leu 


Cvs 


Ser 


Ala 


Leu 


Tvr 


Val 


Gly 


Asp 


Ala 




biy Aid 






80 




















90 


val 


Phe 


Leu 


Val 


Gly 


Gin 


Met 


Phe 


Thr 


Tyr 


Ser 


Pro 


Arg 


Arg His 






95 




















105 


Asn 


Val 


Val 


Gin 


Asp 


w jr o 


Ron 


fS/C 


Ser 


Tin 

115 


Tyr 


Ser 


Cjxy 


                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Val Val Ile Asp Ile Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Phe Ala Ala Ala Tyr Tyr Ala Ser Ala Ala Asn Trp Ala Lys Val
                170                 175                 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190











(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Leu Thr Tyr Gin Asn Ser Ser Gin Leu Tyr His Leu Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Leu Glu Ala Asp Ala Met Ile Leu
                20                  25                  30

His Leu Pro Gin Cys Leu Pro Cys Val Arg Val Asp Asp Arg Ser
                35                  40                  45

Thr Cys Trp His Ala Val Thr Pro Thr Leu Ala Ile Pro Asn Ala
                50                  55                  60

Ser Thr Pro Ala Thr Gin Phe Arg Arg His Val Asp Leu Leu Ala
                65                  70                  75

Gin Ala Ala Val Val Cys Ser Ser Leu Tyr Ile Gin Asp Leu Cys
                80                  85                  90

Gin Ser Leu Phe Leu Ala Gin Gin Leu Phe Thr Phe Gin Pro Arg
                95                  100                 105

Arg His Trp Thr Val Gin Asp Cys Asn Cys Ser Ile Tyr Thr Gin
                110                 115                 120

His Val Thr Gin His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Thr Leu Val Leu Ser Ser Ile Leu Arg Val Pro
                140                 145                 150

Glu Ile Cys Ala Ser Val Ile Phe Gin Gin His Trp Gin Ile Leu
                155                 160                 165

Leu Ala Val Ala Tyr Phe Gin Met Ala Gin Asn Trp Leu Lys Val
                170                 175                 180

Leu Ala Val Leu Phe Leu Phe Ala Gin Val Glu Ala
                185                 190

(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT   39
AAC ACC AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT   78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC  117
AGG GGC CCT AGA TTG GGT GTG CGC GCG CCG AGG AAG ACT  156
TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT ATC  195
CCC AAG GCA CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG  234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC  273
GGG TGG GCG GGA TGG CTC CTG TCT CCC CGT GGC TCT CGG  312
CCT AGC TGG GGC CCC ACA GAC CCC CGG CGC AGG TCG CGC  351
AAT TTG GGT AAA GTC ATC GAT ACC CTT ACG TGC GGC TTC  390
GCC GAC CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCT  429
CTT GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG  468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT  507
CCT GGT TGC TCT TTC TCT ATC TTC CTT TTG GCC CTG CTC  546
TCT TGC CTG ACC GTG CCC GCT TCG GCC                  573











(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: US11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:



PCT/US95/10398

ATG AGC ACG AAT GCT AAA CGT CAA AGA AAA ACC AAA CGT   39
AAC ACC AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT   78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC  117
AGG GGC CCT AGA TTG GGT GTG CGC GCG ACG AGG AAG ACT  156
TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT ATC  195
CCC AAG GCA CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG  234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC  273
GGG TGG GCG GGA TGG CTC CTG TCT CCC CGT GGC TCT CGG  312
CCT AGC TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGC  351
AAT TTG GGT AAG GTC ATC GAT ACC CTT ACG TGC GGC TTC  390
GCC GAC CTC ATG GGG TAC ATA CCG CTC GTC GGC GCC CCT  429
CTC GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG  468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT  507
CCT GGT TGC TCT TTC TCT ATC TTC CTT CTG GCC CTG CTC  546
TCT TGC CTG ACT GTG CCC GCT TCA GCC                  573






(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S14 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

:C AAA CGT 39 

TC CCG GGT 78 

VG CCG CGC 117 

3G AAG ACT 156 

\G CCT ATC 195 

7S "I **~ w ^ w AW AUC TGG GCT CAG 234 

\G GGC TGC 273 

3C TCT CGG 312 

JG TCG CGC 351 

5C GGC TTC 3 90 

5C GCC CCC 429 

JC GTC CGG 468 



30 



ATG 


AGC 


ACG 


AAT 


CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


AAC 


ACC 


AAC 


CGT 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


GGC 


GGT 


CAG 


ATC 


GTT 


GGT 


GGA 


GTT 


TAC 


TTG 


AGG 


GGC 


CCT 


AGA 


TTG 


GGT 


GTG 


CGC 


GCG 


ACG 


TCC 


GAG 


CGG 


TCG 


CAA 


CCT 


CGA 


GGT 


AGA 


CGT 


CCC 


AAG 


GCA 


CGT 


CGG 


CCC 


GAG 


GGC 


AGG 


ACC 


CCC 


GGG 


TAT 


CCT 


TGG 


CCC 


CTC 


TAT 


GGC 


AAT 


GGG 


TGG 


GCG 


GGA 


TGG 


CTC 


CTG 


TCT 


CCC 


CGT 


CCT 


AGC 


TGG 


GGC 


CCC 


ACA 


GAC 


CCC 


CGG 


CGT 


AAT 


TTG 


GGT 


AAG 


GTC 


ATC 


GAT 


ACC 


CTC 


ACG 


GCC 


GAC 


CTC 


ATG 


GGG 


TAC 


ATA 


CCG 


CTC 


GTC 


CTC 


GGG 


GGC 


GCT 


GCC 


AGG 


GCC 


CTG 


GCG 


CAT 


GTT 


CTG 


GAA 


GAC 


GGC 


GTG 


AAC 


TAT 


GCA 


ACA 


CCT 


GGT 


TGC 


TCT 


TTC 


TCT 


ATC 


TTC 


CTC 


CTA 


TCT 


TGC 


CTG 


ACT 


GTG 


CCC 


GCT 


TCA 


GCC 



507 
54 6 
573 



(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT   39
AAC ACC AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT   78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC  117
AGG GGC CCT AGA TTG GGT GTG CGC GCG ACG AGG AAG ACT  156
TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT ATC  195
CCC AAG GCG CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG  234
CCC GGG TAT CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC  273
GGA TGG GCG GGA TGG CTC CTG TCC CCC CGT GGC TCT CGG  312
CCT AGC TGG GGC CCT ACA GAC CCC CGG CGT AGG TCG CGC  351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC  390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCT  429
CTT GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG  468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT  507
CCT GGT TGC TCT TTC TCT ATC TTC CTT CTG GCC CTG CTT  546
TCT TGC CTG ACA GTG CCC GCG TCA GCC                  573











(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT   39
AAC ACC AAC CGT CGC CCA CAG GAC GTT AAG TTC CCG GGT   78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC  117
AGG GGC CCT AGA TTG GGT GTG CGC GCG ACG AGG AAG ACT  156
TCC GAG CGG TCG CAA CCT CGC GGT AGA CGT CAG CCT ATC  195
CCC AAG GCG CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG  234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC  273
GGG TGG GCG GGA TGG CTC CTG TCC CCC CGT GGC TCC CGG  312
CCT AGC TGG GGC CCT ACA GAC CCC CGG CGT AGG TCG CGC  351
AAT TTG GGC AAA GTC ATC GAT ACC CTC ACG TGC GGC TTC  390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCT  429
CTC GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG  468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT  507
CCT GGT TGC TCT TTC TCT ATC TTC CTT CTG GCC CTG CTC  546
TCT TGT CTG ACT GTG CCC GCG TCA GCT                  573



(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

( C ) STRANDEDNESS : s ingl e 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM : homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 







(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT 78
GGC GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117
AGG GGC CCT AGA TTG GGT GTG CGC GCG ACG AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGA GGT AGA CGT CAG CCT ATC 195
CCC AAG GCG CGT CGG CCC GAG GGC AGG ACC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC 273
GGG TGG GCG GGA TGG CTC CTG TCC CCC CGT GGC TCT CGG 312
CCT AGC TGG GGC CCC ACA GAC CCC CGG CGT AGG TCG CGC 351
AAT TTG GGT AAG GTC ATC GAC ACC CTC ACG TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCC CCC 429
CTT GGG GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGA 468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAT CTT 507
CCT GGT TGC TCT TTC TCT ATC TTC CTT TTG GCT TTG CTC 546
TCT TGC TTG ACC GTG CCC GCA TCG GCC 573









(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTC TAT CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACG AGG AAG ACT 156



SUBSTITUTE SHEET (RULE 26) 



WO 96/05315 



PCT/US95/10398



TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CAG CCC GAG GGC AGG ACC TGG GCC CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGT GGC TCT CGG 312
CCT AGT TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCT 429
TTA GGG GGC GCT GCC AGG GCC TTG GCG CAT GGC GTC CGG 468
GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAT TTG 507
CCC GGT TGC CCT TTC TCT ATC TTC CTC TTG GCT TTG CTG 546
TCC TGT TTA ACC ATC CCA GCT TCC GCT 573



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S45 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA CAA ACC AAA CGT 39 

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGT 78 

GGC GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156 

TCC GAG CGG TCA CAA CCT CGT GGA CGG CGA CAA CCT ATC 195 

CCC AAG GCT CGC CGG CCC GAG GGC AGG GCC TGG GCC CAG 234 

CCC GGG CAT CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG 273 

GGG TGG GCA GGA TGG CTC CTG TCA CCC CGT GGC TCC CGG 312 

CCT AGT TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGC 351 

AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC 390 

GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC 429

CTA GGG GGC GCT GCC AGA GCC TTG GCG CAT GGC GTC CGG 468 

GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT CTG 507 

CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT CTG CTG 546 

TCC TGC TTG ACC ATC CCA GCT TCC GCT 573 



(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:



(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: D1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

ATG AGC [illegible] [illegible] CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CGG CCC GAG GGT AGG GCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAC GAG GGC TTG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGC GGC TCC CGG 312
CCT AGT TGG GGC CCC ACC GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCC CCC 429
CTA GGG GGT GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTT CTG GAG GAC GGC GTG AAT TAT GCA ACA GGG AAT TTG 507
CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG CTG 546
TCC TGT TTG ACC ATC CCA GCT TCC GCT 573

(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CGG CCC GAG GGC AGG GCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAC GAG GGC ATG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGT GGC TCC CGG 312
CCT AGT TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC 429
CTA GGG GGC GCT GCC AGG GCC TTG GCG CAT GGC GTC CGG 468
GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAC TTG 507
CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG CTG 546
TCC TGT TTG ACC ATT CCA GCT TCC GCT 573




(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: P10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CGG CCC GAG GGC AGG GCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGT GGC TCT CGG 312
CCT AGT TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC 429
CTA GGG GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT CTG 507
CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG CTG 546
TCC TGC CTG ACC ATC CCA GCG TCC GCT 573











(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CGG CCC GAG GGC AGG GCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC ATG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGC GGC TCT CGG 312
CCT AGT TGG GGC CCC AAC GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC 429
CTA GGG GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTT CTG GAG GAC GGC GTG AAC TAC GCA ACA GGG AAT TTG 507
CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT CTG TTG 546
TCC TGT TTG ACC ATC CCA GCT TCC GCC 573



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: T10 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115:

[...] [...] ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
[...] [...] AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
[...] [...] GAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
[...] [...] CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
[...] [...] CGG TCG CAA CCT CGT GGA AGG CGA CAG CCT ATC 195
[...] [...] GCT CGC CAG CCC GAG GGC AGG GCC TGG GCT CAG 234
[...] [...] TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC ATG 273
[...] [...] GCA GGA TGG CTC CTG TCA CCC CGT GGC TCC CGG 312
[...] [...] TGG GGC CCC ACA GAC CCC CGG CGT AGG TCG CGT 351
[...] [...] GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
[...] [...] CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC 429
[...] [...] GGC GCT GCC AGG GCT CTG GCA CAT GGT GTC CGG 468
[...] [...] GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT TTG 507
[...] [...] TGC TCT TTT TCT ATC TTC CTC TTG GCT CTG CTG 546
[...] [...] CTG ACC ATC CCA GCT TCC GCT 573



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGC CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC CGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CAG CCC GAG GGC AGG GCC TGG GCT CAG 234
CCT GGG TAC CCC TGG CCC CTC TAT GGC AAT GAG GGC ATG 273
GGA TGG GCA GGA TGG CTC CTG TCC CCC CGC GGC TCT CGG 312
CCT AGT TGG GGC CCC ACT GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC 429
CTA GGG GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTC CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT CTG 507
CCC GGT TGC TCC TTT TCT ATC TTC CTC TTG GCT TTG CTG 546
TCC TGT CTG ACC ATC CCA GCT TCC GCT 573



(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGC CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CGG CCC GAG GGT AGG GCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGC GGT TCT CGG 312
CCT AGT TGG GGC CCC ACA GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAA GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCC CCC 429
CTA GGG GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTC CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAC TTG 507
CCC GGT TGC TCT TTC TCT ATC TTC CTT TTA GCT TTG CTA 546
TCC TGT TTG ACC ATC CCA GCT TCC GCT 573



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: IND8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGC CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CGG CCC GAG GGT AGG GCC TGG GCT CAG 234
CCC GGG CAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TTG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGC GGC TCT CGG 312
CCT AGT TGG GGC CCC ACA GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCC CCC 429
CTA GGG GGT GCT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTC CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAC TTG 507
CCC GGT TGC TCT TTC TCT ATC TTC CTT TTG GCT TTG CTA 546
TCC TGT TTG ACC GTC CCA GCT TCC GCT 573

(2) INFORMATION FOR SEQ ID NO: 119:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGC 78
GGT GGT CAG ATC GTC GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCA ACT AGG AAG ACT 156
TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CAT CCC GAG GGC AGG GCC TGG GCT CAG 234
CCC GGG TAC CCT TGG CCC CTC TAC GGC AAT GAG GGC TTG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGT GGC TCT CGG 312
CCT AGT TGG GGC CCC AAT GAC CCC CGG CGT AGG TCG CGT 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTT 390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC 429
CTA GGG GGC GCT GCC AGG GCT CTG GCG CAT GGC GTC CGG 468
GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAC CTC 507
CCC GGT TGC TCT TTC TCT ATC TTC CTT CTG GCT TTG CTG 546
TCC TGT TTG ACC ATC CCA GCT TCC GCT 573



(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACC AGG AAG ACT 156
TCA GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CAA CCC GAG GGC AGG ACC TGG GCT CAG 234
CCC GGG TAT CCT TGG CCC CTC TAT GGC AAC GAG GGC ATG 273
GGG TGG GCA GGA TGG CTC CTG TCA CCC CGC GGC TCT CGG 312
CCT AAT TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGC 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGT GCC CCC 429
CTA GGG GGC GTT GCC AGA GCC TTG GCA CAT GGT GTC CGG 468
GTT CTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT TTA 507
CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG CTG 546
TCC TGC TTG ACC ACC CCA GCT TCC GCT 573



(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39 
AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78 
GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117






AGG GGC CCC AGG TTG GGT GTG CGC GCG ACC AGG AAG ACT 156 

TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195 

CCC AAG GCT CGC CGA CCC GAG GGC AGG ACC TGG GCT CAG 234 

CCC GGG TAT CCT TGG CCC CTC TAT GGC AAT GAG GGC ATG 273 

GGG TGG GCA GGA TGG CTC CTG TCA CCC CAT GGC TCT CGG 312 

CCT AGT TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGT 351 

AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC 390 

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCC CCC 429

CTA GGG GGC GTT GCC AGA GCC CTG GCA CAC GGT GTC CGG 468

GTT CTG GAG GAC GGC GTG AAC TAC GCA ACA GGG AAT ATA 507 

CCC GGT TGC TCT TTC TCT ATC TTC CTT TTG GCT TTG CTG 546

TCC TGT CTG ACC ACC CCA GTT TCC GCT 573 

(2) INFORMATION FOR SEQ ID NO: 122:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

[...] [...] [...] AAT CCT AAA CCT CAA AGA AAG ACC AAA CGT 39
[...] [...] [...] CGC CGC CCA CAG GAC GTT AAG TTC CCG GGC 78
[...] [...] [...] ATC GTC GGT GGA GTT TAC CTG TTG CCG CGC 117
[...] [...] [...] AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
[...] [...] [...] TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195
[...] [...] [...] CGC CAA CCC GAG GGC AGG ACC TGG GCT CAG 234
[...] [...] [...] CCT TGG CCC CTC TAT GGC AAT GAG GGC ATG 273
[...] [...] [...] GGA TGG CTC CTG TCA CCC CGC GGC TCT CGG 312
[...] [...] [...] GGC CCC ACG GAC CCC CGG CGT AGG TCG CGC 351
[...] [...] [...] AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
[...] [...] [...] ATG GGG TAC ATT CCG CTC GTC GGC GCC CCC 429
[...] [...] [...] GTT GCC AGA GCC CTG GCA CAT GGT GTC CGG 468
[...] [...] [...] GAC GGC GTG AAC TAT GCA ACA GGG AAT TTG 507
[...] [...] [...] TCT TTC TCT ATC TTC CTC TTG GCT CTG CTG 546
[...] [...] [...] ACC ATC CCA GCT TCC GCT 573



(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: P8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 



ATG AGC ACG ACT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39
AAC ACC AGC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGC 78
GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156
TCC GAG CGA TCG CAA CCT CGT GGC AGG CGA CAA CCT ATC 195
CCC AAG GCT CGC CGG CCC GAG GGT AGG GCC TGG GCT CAG 234
CCC GGG CAC CCT TGG CCC CTC TAT GCC AAT GAG GGC TTG 273
GGG TGG GCG GGA TGG CTC CTG TCA CCC CGC GGC TCC CGG 312
CCT AGT TGG GGC CCC ACG GAC CCC CGG CGT AGG TCG CGC 351
AAT TTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390
GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GGC CCC 429
CTA GGG GGC GTT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468
GTT GTG GAG GAC GGC GTG AAC TAT GCA ACA GGG AAT CTG 507
CCT GGT TGC TCT TTC TCT ATC TTC CTT TTG GCT TTG CTG 546
TCT TGT CTG ACC ATC CCA GCT TCC GCT 573

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39

AAC ACC AAC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGC 78 

GGT GGT CAG ATC GTT GGT GGA GTT TAC CTG TTG CCG CGC 117 

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156 

TCC GAG CGG TCG CAA CCT CGT GGA AGG CGA CAA CCT ATC 195 

CCC AAG GCT CGC CGG CCC GAG GGT AGG GCC TGG GCT CAG 234 

CCC GGG TAC CCT TGG CCC CTC TAT GGC GAC GAG GGC ATG 273 

GGG TGG GCA GGA TGG CTC CTG TCA CCC CGC GGC TCC CGG 312

CCT AAT TGG GGC CCC ACA GAC CCC CGG CGT AGG TCG CGT 351 

AAT CTG GGT AAG GTC ATC GAT ACC CTC ACA TGC GGC TTC 390 

GCC GAC CTC ATG GGG TAC ATT CCG CTC GTC GGC GCT CCC 429

TTA GGG GGC GTT GCC AGG GCC CTG GCG CAT GGC GTC CGG 468 

GTT CTG GAG GAC GGC GTG AAT TAC GCA ACA GGG AAT TTG 507 

CCT GGT TGC TCT TTC TCT ATC TTC CTC TTG GCT TTG CTG 546 

TCC TGC TTG ACC ATC CCA GCT TCC GCT 573 
(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 



ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39
AAC ACC AAC CGT CGC CCA CAG GAC GTT AAG TTC CCG GGC 78
GGC GGC CAG ATC GTT GGC GGA GTA TAC TTG TTG CCG CGC 117
AGG GGC CCC AGG TTG GGT GTG CGC GCG ACA AGG AAG ACT 156
TCG GAG CGA TCC CAG CCA CGT GGG AGG CGC CAG CCC ATC 195
CCC AAA GAT CGG CGC TCC ACT GGC AAG TCC TGG GGA AAA 234
CCA GGA TAT CCC TGG CCC CTG TAT GGG AAT GAG GGA CTC 273
GGC TGG GCA GGA TGG CTC CTG TCC CCC CGA GGT TCC CGT 312
CCC TCC TGG GGC CCC AAT GAC CCC CGG CAT AGG TCG CGC 351
AAC GTG GGT AAG GTC ATC GAT ACC CTA ACG TGC AGC CTT 390
GCC GAC CTC ATG GGG TAC GTC CCC GTC GTA GGC GGC CCG 429
TTG GGT GGC GTC GCC AGA GCT CTC GCG CAT GGC GTG AGA 468
GTC CTG GAG GAC GGG GTT AAT TAT GCA ACA GGG AAC TTA 507
CCT GGT TGC TCC TTT TCT ATT TTC TTG CTG GCC CTA CTG 546
TCC TGC ATC ACC ATT CCA GTC TCC GCT 573



(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126:

ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39 

AAC ACT AAC CGT CGC CCA CAA GAC GTT AAG TTT CCG GGC 78 

GGC GGC CAG ATC GTT GGC GGA GTA TAC TTG TTG CCG CGC 117 

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACA AGG AAG ACT 156 

TCG GAG CGG TCC CAG CCA CGT GGG AGG CGC CAG CCC ATC 195 

CCC AAA GAT CGG CGC CCC ACT GGC AAG TCC TGG GGA AAA 234 

CCA GGA TAC CCT TGG CCC CTA TAT GGG AAT GAG GGA CTC 273

GGC TGG GCA GGA TGG CTC CTG TCC CCC CGA GGT TCC CGT 312

CCC TCT TGG GGC CCC ACT GAT CCC CGG CAT AGG TCG CGC 351

AAC GTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGC TTT 390

GCC GAC CTC ATG GGA TAC ATC CCC GTC GTG GGC GCT CCG 429

CTT GGT GGC GTC GCC AGA GCT CTC GCG CAT GGC GTG AGG 468

GTC CTG GAG GAC GGG GTT AAT TAT GCA ACA GGG AAC TTA 507

CCC GGT TGC TCC TTT TCT ATC TTC TTG CTG GCC TTA CTG 546

TCC TGC ATC ACC ATT CCA GTC TCT GCT 573

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

ATG AGC ACA AAT CCA AAA CCC CAA AGA AAA ACC ATA AGA 39
AAC ACC AAC CGT CGC CCA CAG GAC GTT AAG TTC CCG GGC 78
GGC GGC CAG ATC GTT GGC GGA GTA TAC TTG TTG CCG CGC 117
AGG GGC CCT AGG TTG GGT GTG CGC ACG ACA AGG AAG ACT 156
TCG GAG CGG TCC CAG CCA CGT GGG AGG CGC CAG CCC ATC 195
CCC AAA GAT CGG CGC TCC ACT GGC AAG TCC TGG GGA AAA 234
CCA GGA TAC CCC TGG CCT CTA TAT GGG AAT GAG GGA CTC 273
GGC TGG GCG GGA TGG CTC CTG TCC CCC CGA GGT TCC CGT 312
CCC TCT TGG GGC CCC AGT GAC CCC CGG CAT AGG TCG CGC 351
AAC GTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGC TTT 390
GCC GAC CTC ATG GGG TAC ATC CCC GTC GTA GGC GCC CCG 429
CTT GGT GGC GTT GCC AGA GCT CTC GCG CAC GGC GTG AGA 468
GTC CTG GAG GAC GGG GTT AAT TAT GCA ACA GGG AAC CTA 507
CCT GGT TGC TCT TTT TCT ATC TTC TTG CTG GCC CTA CTG 546
TCC TGC ATC ACC ACT CCG GCC TCT GCT 573











(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:



ATG AGC ACA ATT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACT AAC CGT CGC CCA CAA GAC GTT AAG TTT CCG GGC 78

GGC GGC CAG ATC GTT GGC GGA GTA TAC TTG CTG CCG CGC 117

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACA AGG AAG ACT 156

TCG GAG CGG TCC CAG CCT CGT GGA AGG CGC CAG CCC ATC 195

CCT AAA GAT CGG CGC TCC ACT GGC AAG TCC TGG GGA AAA 234

CCA GGA TAC CCC TGG CCC CTG TAT GGG AAT GAG GGG CTC 273

GGC TGG GCA GGA TGG CTC CTG TCC CCC CGA GGT TCT CGT 312

CCC TCT TGG GGC CCC AAT GAC CCC CGG CAT AGG TCG CGC 351

AAT GTG GGT AAA GTC ATC GAT ACC CTA ACG TGC GGC TTT 390

GCC GAC CTC ATG GGG TAC ATC CCC GTC GTA GGC GCC CCG 429

CTT GGT GGT GTC GCC AGA GCT CTT GCG CAT GGC GTG AGA 468

GTC CTG GAG GAC GGA GTT AAT TAT GCA ACA GGT AAC TTA 507

CCC GGT TGC TCC TTT TCT ATC TTC TTG CTA GCC CTG CTG 546

TCC TGC ATC ACT ATT CCG GTT TCA GCT 573



(2) INFORMATION FOR SEQ ID NO: 129:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T8



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:

ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACA AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGT 78

GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG CTG CCG CGC 117

AGG GGC CCT AGG TTG GGT GTG CGC GCG ACA AGG AAG ACT 156

TCC GAG CGA TCC CAG CCG CGT GGG AGA CGC CAG CCC ATC 195

CCG AAA GAT CGG CGC TCC ACC GGC AAG TCC TGG GGA AAA 234

CCA GGA TAT CCT TGG CCT CTT TAC GGA AAC GAG GGC TGC 273

GGT TGG GCA GGT TGG CTC CTG TCC CCC CGC GGG TCT CGT 312

CCT ACT TGG GGC CCC ACT GAC CCC CGG CAT AGA TCA CGT 351

AAT TTG GGC AGA GTC ATC GAT ACC ATT ACA TGT GGT TTT 390

GCC GAC CTC ATG GGG TAC ATC CCT GTC GTT GGC GCC CCG 429

GTC GGA GGC GTC GCC AGA GCT CTG GCA CAT GGT GTT AGG 468

GTC CTG GAA GAC GGG ATA AAC TAT GCA ACA GGG AAT TTG 507

CCT GGT TGC TCT TTT TCT ATC TTC TTG CTT GCT CTT CTG 546

TCA TGC TTC ACA GTG CCA GTG TCT GCA 573











(2) INFORMATION FOR SEQ ID NO: 130: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130:

ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACA AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGT 78

GGC GGT CAG ATC GTT GGC GGA GTT TAC TTG CTG CCG CGC 117

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACA AGG AAG ACT 156

TCC GAG CGA TCC CAG CCG CGT GGG AGA CGC CAG CCC ATC 195

CCG AAA GAT CGG CGC TCC ACC GGC AAG TCC TGG GGA AAG 234

CCA GGA TAT CCT TGG CCT CTG TAC GGA AAC GAG GGC TGC 273

GGC TGG GCA GGT TGG CTC CTG TCC CCC CGC GGG TCT CGT 312

CCT ACT TGG GGC CCC ACT GAC CCC CGG CAC AGA TCA CGT 351

AAC TTG GGC AAG GTC ATC GAT ACC ATT ACG TGT GGT TTT 390

GCC GAC CTC ATG GGG TAC ATC CCT GTC GTT GGC GCC CCG 429

GTC GGA GGC GTC GCC AGA GCT CTG GCA CAC GGT GTT AGG 468

GTC CTG GAA GAC GGG ATA AAT TAC GCA ACA GGG AAT CTG 507

CCT GGT TGC TCC TTT TCT ATC TTC TTA CTT GCT CTT CTG 546

TCG TGC GCC ACG GTG CCG GTG TCT GCA 573



(2) INFORMATION FOR SEQ ID NO: 131: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK11 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 



ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAT ACA AAC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGT 78

GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG CTG CCG CGC 117

AGG GGC CCC AGG TTG GGT GTG CGC ACG ACA AGG AAG ACT 156

TCC GAG CGA TCC CAG CCG CGT GGG AGA CGC CAG CCC ATC 195

CCG AAA GAT CGG CGC TCC ACC GGC AAG CCC TGG GGA AAG 234

CCA GGA TAT CCT TGG CCC CTG TAT GGA AAC GAG GGC TGC 273

GGC TGG GCA GGT TGG CTC CTG TCC CCC CGC GGG TCT CAT 312

CCT AAT TGG GGC CCC ACT GAC CCC CGG CAT AAA TCA CGC 351

AAT TTG GGT AAA GTC ATC GAC ACC ATT ACG TGT GGT TTT 390

GCC GAC CTC ATG GGG TAC ATC CCT GTC GTC GGC GCC CCG 429

GTC GGA GGC GTC GCC AGA GCT CTG GCA CAC GGT GTT AGA 468

GTC CTG GAA GAC GGG ATA AAT TAC GCA ACA GGG AAT CTG 507

CCT GGT TGC TCT TTT TCT ATC TTC TTA CTT GCT CTT CTG 546

TCA TGC TGC ACA GTG CCA GTG TCT GCG 573

(2) INFORMATION FOR SEQ ID NO: 132:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

ATG AGC ACA [...] CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAT ACA AAC [...] CGC CCA CAG GAC GTT AAG TTC CCG GGT 78

[...] GTT GGC GGA GTT TAC TTG CTG CCG CGC 117

[...] TTG GGT GTG CGC GCG ACA AGG AAG ACT 156

[...] CAG CCG CGT GGG AGA CGC CAG CCC ATC 195

[...] CGC TCC ACC GGC AAG TCC TGG GGA AAG 234

[...] TGG CCC CTG TAT GGA AAC GAG GGC TGC 273

[...] TGG CTC CTG TCC CCC CGC GGG TCT CAT 312

[...] CCC ACT GAC CCC CGG CAT AGA TCA CGC 351

[...] GTC ATC GAC ACC ATT ACG TGT GGT TTT 390

[...] GGG TAC ATC CCT GTC GTT GGC GCC CCG 429

[...] GCC AGA GCT CTG GCA CAC GGT GTT AGA 468

[...] GGG ATA AAT TAC GCA ACA GGG AAT CTG 507

[...] TTT TCT ATC TTC TTA CTT GCT CTT CTG 546

[...] GTG CCA GTG TCT GCG 573



(2) INFORMATION FOR SEQ ID NO: 133:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 


(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133:

ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACA AAC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGT 78

GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG CTG CCG CGC 117

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACA AGG AAG TCT 156

TCC GAG CGA TCC CAG CCG CGT GGG AGG CGC CAG CCC ATC 195

CCG AAA GAT CGG CGC TCC ACC GGC AAG TCC TGG GGA AAA 234

CCG GGA TAT CCT TGG CCC CTG TAT GGA AAC GAG GGC TGC 273

GGC TGG GCA GGT TGG CTC CTG TCC CCC CGC GGG TCT CGT 312

CCT ACT TGG GGC CCC ACT GAC CCC CGG CAT AGA TCA CGC 351

AAT TTG GGC AAA GTC ATC GAC ACC ATT ACG TGT GGT TTT 390

GCC GAC CTC ATG GGG TAC ATC CCT GTC GTT GGC GCC CCG 429

GTT GGA GGC GTC GCC AGA GCT CTG GCA CAC GGT GTT AGG 468

GTC CTG GAA GAC GGG ATA AAT TAC GCA ACA GGG AAT TTG 507

CCT GGT TGC TCT TTT TCT ATC TTC TTG CTT GCT CTT CTG 546

TCG TGC TGC ACA GTG CCA GTG TCT GCG 573




(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S83 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 



ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACT AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78

GGT GGC CAG ATC GTT GGC GGA GTA TAC TTG CTG CCG CGC 117

AGG GGC CCG AGA TTG GGT GTG CGC GCG ACG AGG AAA ACT 156

TCC GAA CGG TCC CAG CCA CGT GGG AGG CGC CAG CCC ATC 195

CCT AAA GAT CGG CGC ACC ACT GGC AAG TCC TGG GGA AGG 234

CCA GGA TAC CCT TGG CCC CTG TAT GGG AAT GAG GGC CTC 273

GGC TGG GCA GGG TGG CTC CTG TCC CCC CGC GGT TCT CGC 312

CCT TCA TGG GGC CCC ACC GAC CCC CGG CAT AAA TCG CGC 351

AAC TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGT TTT 390

GCC GAC CTC ATG GGG TAC ATA CCC GTC GTT GGC GCT CCC 429

GTT GGC GGC GTT GCC AGA GCC CTC GCC CAT GGG GTG AGG 468

GTT CTG GAG GAC GGG ATA AAT TAT GCA ACG GGG AAT TTG 507

CCC GGT TGC TCT TTC TCT ATC TTT CTC TTG GCC CTC TTG 546

TCT TGC ATC TCT GTG CCA GTT TCC GCC 573




(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



WO 96/05315



PCT/US95/10398 






(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

ATG AGC ACA CTT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC ATC CGT CGC CCA CAG GAC GTT AAG TTC CCG GGT 78

GGC GGA CAG ATC GTT GGT GGA GTA TAC GTG TTG CCG CGC 117

AGG GGC CCA CGA TTG GGT GTG CGC GCG ACG CGT AAA ACT 156

TCT GAA CGG TCG CAG CCT CGC GGA CGA CGA CAG CCT ATC 195

CCC AAG GCG CGT CGG AGC GAA GGC CGG TCC TGG GCT CAG 234

CCC GGG TAC CCT TGG CCC CTC TAT GGT AAC GAG GGC TGC 273

GGG TGG GCA GGA TGG CTC CTG TCC CCA CGC GGC TCC CGT 312

CCA TCT TGG GGC CCA AAC GAC CCC CGG CGA CGG TCC CGC 351

AAT TTG GGT AAA GTC ATC GAT ACC CTT ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCT CCC 429

GTA GGA GGC GTC GCA AGA GCC CTC GCG CAT GGC GTG AGG 468

GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAC TTG 507

CCC GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC 546

TCT TGC TTA ATT CAT CCA GCA GCT AGT 573

(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S52

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:

ATG AGC ACA CTT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC ATC CGT CGC CCA CAG GAC GTT AAG TTC CCG GGT 78

GGC GGA CAG ATC GTT GGT GGA GTA TAC GTG TTG CCG CGC 117

AGG GGC CCA CGA TTG GGT GTG CGC GCG ACG CGT AAA ACT 156

TCT GAA CGG TCA CAG CCT CGC GGA CGA CGA CAG CCT ATC 195

CCC AAG GCG CGT CGG AGC GAA GGC CGG TCC TGG GCT CAG 234

CCC GGG TAC CCT TGG CCC CTC TAT GGT AAT GAG GGC TGC 273

GGG TGG GCA GGG TGG CTC CTG TCC CCA CGC GGC TCC CGT 312

CCA TCT TGG GGC CCA AAC GAC CCC CGG CGG AGG TCC CGC 351

AAT TTG GGT AAA GTC ATC GAT ACC CTT ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCT CCC 429

GTA GGA GGC GTC GCA AGA GCC CTC GCG CAT GGC GTG AGG 468

GCC CTT GAA GAC GGG ATA AAT TTT GCA ACA GGG AAC TTG 507

CCC GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC 546

TCC TGC TTA GTT CAT CCT GCA GCT AGT 573



(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

ATG AGC ACA CTT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC ATC CGT CGC CCA CAG GAC ATC AAG TTC CCG GGT 78

GGC GGA CAG ATC GTT GGT GGA GTA TAC GTG TTG CCG CGC 117

AGG GGC CCA CGA TTG GGT GTG CGC GCG ACG CGT AAA ACT 156

TCT GAA CGG TCA CAG CCT CGC GGA CGG CGA CAG CCT ATC 195

CCC AAG GCG CGT CGG AGC GAA GGC CGA TCC TGG GCT CAG 234

CCC GGG TAC CCT TGG CCC CTC TAT GGT AAC GAG GGC TGC 273

GGG TGG GCA GGG TGG CTC CTG TCC CCA CGC GGC TCC CGT 312

CCA TCT TGG GGC CCA AAT GAC CCC CGG CGG AGG TCC CGC 351

AAT TTG GGT AAA GTC ATC GAT ACC CTT ACG TGC GGC TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCT CCC 429

GTA GGA GGC GTC GCA AGA GCC CTC GCG CAT GGC GTG AGG 468

GCC CTT GAA GAC GGG ATA AAT TTT GCA ACA GGG AAC TTG 507

CCC GGT TGC TCT TTT TCT ATC TTC CTT CTT GCC CTG TTC 546

TCT TGC TTA ATT CAT CCA GCA GCT AGT 573











(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK12

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:



ATG AGC ACA CTT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39 

AAC ACC ATC CGT CGC CCA CAG GAC GTC AAG TTC CCG GGT 78 

GGC GGA CAG ATC GTT GGT GGA GTA TAC GTG TTG CCG CGC 117 

AGG GGC CCA CGA TTG GGT GTG CGC GCG ACG CGT AAA ACT 156 

TCT GAA CGG TCA CAG CCT CGC GGA CGG CGA CAG CCT ATC 195 

CCC AAG GCG CGT CGG AGC GAA GGC CGG TCC TGG GCT CAG 234 

CCT GGG TAC CCT TGG CCC CTC TAT GGT AAC GAG GGC TGC 273

GGG TGG GCA GGG TGG CTC CTG TCC CCA CGC GGC TCC CGT 312

CCA TCT TGG GGC CCA AAC GAC CCC CGG CGG AGG TCC CGC 351

AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTC GGC GCT CCT 429

GTA GGG GGC GTC GCA AGA GCC CTC GCG CAT GGC GTG AGG 468

GCC CTT GAA GAC GGG ATA AAT TTC GCA ACA GGG AAC TTG 507

CCC GGT TGC TCC TTT TCT ATC TTC CTT CTT GCT CTG TTC 546

TCT TGC CTA ATT CAT CCA GCA GCT AGT 573



(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z4



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39

AAC ACC AAC CGC CGC CCC ATG GAC GTA AAG TTC CCG GGT 78

GGT GGC CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT CGA AAG ACT 156

TCG GAG CGG TCG CAA CCT CGT GGC AGG CGT CAA CCT ATC 195

CCC AAG GCG CGC CAG CCA GAG GGC AGA TCC TGG GCG CAG 234

CCC GGG TAC CCT TGG CCC CTC TAT GGC AAT GAG GGC TGC 273

GGG TGG GCA GGG TGG CTC CTG TCT CCT CGC GGC TCT CGG 312

CCA TCT TGG GGC CCA AAT GAT CCC CGG CGG AGA TCG CGC 351

AAT CTG GGT AAG GTC ATC GAT ACC CTG ACG TGC GGC TTC 390

GCC GAC CTC ATG GGA TAC ATC CCG ATC GTG GGC GCC CCC 429

GTG GGG GGC GTC GCC AGG GCT CTG GCG CAT GGC GTC AGG 468

GCT GTG GAG GAC GGG ATT AAC TAT GCA ACA GGG AAT CTT 507

CCC GGT TGC TCT TTC TCT ATC TTC CTT TTG GCA CTT CTT 546

TCG TGC CTC ACT GTT CCA GCG TCG GCT 573

(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39

AAC ACC AAC CGC CGC CCT ATG GAT GTA AAA TTC CCA GGC 78

GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156

TCG GAG CGG TCG CAA CCT CGT GGC AGG CGT CAG CCT ATC 195

CCC AAG GCA CGT CGG TCC GAG GGT AGG TCC TGG GCT CAG 234

CCC GGG TAC CCA TGG CCT CTT TAC GGT AAT GAA GGC TGT 273

GGG TGG GCA GGT TGG CTC CTG TCC CCC CGC GGC TCT CGA 312

CCG TCT TGG GGC CCA AAT GAT CCC CGG CGG AGG TCG CGC 351

AAT TTG GGT AAG GTC ATC GAT ACC CTC ACG TGC GGC TTC 390

GCC GAC CTC ATG GGA TAC ATC CCG CTC GTG GGC GCC CCA 429

GTA GGA GGC GTC GCC AGA GCC CTG GCG CAT GGC GTC AGG 468

GCT GTG GAG GAC GGG ATC AAC TAT GCA ACA GGG AAC CTT 507

CCT GGT TGC TCT TTC TCT ATC TTC CTC TTG GCA CTT CTC 546

TCG TGC CTA ACC GTC CCA GCG TCT GCT 573



(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

ATG AGC ACA AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39

AAC ACC AAC CGT CGC CCC ATG GAT GTG AAA TTC CCG GGC 78

GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG CTG CCG CGC 117

AGG GGC CCC CGG TTG GGT GTG CGC GCA GCT CGG AAG ACT 156

TCG GAG CGG TCA CAA CCT CGT GGC AGG CGT CAG CCT ATC 195

CCC AAG GCG CGC CGG TCC GAG GGC AGG TCC TGG GCT CAG 234

CCC GGG TAC CCT TGG CCC CTT TAC GGC AAT GAG GGC TGT 273

GGG TGG GCA GGG TGG CTC CTG TCC CCC CGC GGT TCC AGG 312

CCG TCT TGG GGC CCC AAT GAT CCC CGG CGT AGG TCC CGT 351

AAT CTG GGT AAA GTC ATC GAT ACC CTG ACG TGT GGC TTC 390

GCC GAC CTC ATG GGA TAC ATT CCG CTC GTA GGC GCC CCT 429

GTG GGT GGC GTC GCC AGG GCC CTG GCG CAT GGC GTC AGG 468

GCC GTG GAG GAC GGA ATT AAC TAC GCA ACA GGG AAC CTT 507

CCT GGT TGC TCT TTC TCT ATC TTT CTT CTT GCA CTT CTC 546

TCG TGC CTG ACA ACA CCA GCA TCT GCC 573










(2) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

[...] AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39

[...] CGC CGC CCC ATG GAT GTA AAA TTC CCG GGT 78

[...] ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117

[...] AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156

[...] TCG CAA CCT CGC GGC AGG CGT CAG CCT ATC 195

[...] CGT CGG TCC GAG GGC AGG TCC TGG GCT CAG 234

[...] CCT TGG CCT CTT TAT GGC AAT GAG GGC TGT 273

[...] GGG TGG CTC CTG TCC CCC CGC GGA TCT CGG 312

[...] GGC CAA AAT GAT CCC CGG CGT AGG TCC CGC 351

[...] AAG GTC ATC GAT ACC CTG ACG TGT GGC TTC 390

[...] ATG GGA TAC ATT CCG CTC GTC GGC GCC CCA 429

[...] GTC GCC AGG GCC TTG GCG CAT GGC GTC AGG 468

[...] GAC GGA ATC AAC TAT GCA ACA GGG AAT CTT 507

[...] TCC TTT TCT ATC TTC CTA CTT GCA CTT TTC 546

[...] ACA ACA CCG GCA TCC GCT 573

(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: Z6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

[...] AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39

[...] CGC CGC CCC ATG GAC GTT AAG TTC CCG GGT 78

[...] ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117

[...] AGG TTG GGT GTG CGC GCG ACT AGG AAG ACT 156

[...] TCG CAA CCT CGT GGG AGA CGC CAG CCT ATC 195

[...] CGT CGA TCT GAG GGA AGG TCC TGG GCT CAG 234

[...] CCA TGG CCT CTT TAC GGT AAT GAG GGT TGC 273

[...] GGA TGG CTC CTG TCA CCC CGT GGC TCT CGA 312

[...] GGT CCA AAT GAT CCC CGG CGA AGG TCC CGC 351

[...] AAG GTC ATC GAT ACT CTA ACT TGC GGT TTC 390

[...] ATG GGA TAC ATC CCG CTC GTA GGC GCC CCC 429

[...] GTC GCC AGG GCC CTG GCA CAT GGT GTT AGG 468

GCT GTG GAG GAC GGG ATC AAT TAT GCA ACA GGG AAT CTT 507

CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCA CTT CTT 546

TCG TGC CTA ACT GTT CCC ACC TCG GCC 573



(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: Z7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39

AAC ACC AAC CGC CGC CCC ATG GAC GTT AAG TTC CCG GGC 78

GGT GGC CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCC AGA TTG GGT GTG CGC ACA ACT AGG AAG ACT 156

TCG GAG CGG TCG CAA CCT CGT GGG AGA CGT CAG CCT ATC 195

CCC AAG GCA CGT CGA TCT GAG GGA AGG TCC TGG GCT CAA 234

CCC GGG TAC CCA TGG CCT CTT TAC GGT AAC GAG GGT TGC 273

GGG TGG GCA GGA TGG CTC TTG TCA CCC CGT GGC TCT CGA 312

CCG TCT TGG GGC CCA AAT GAT CCC CGG CGA AGG TCC CGC 351

AAC TTG GGT AAG GTC ATC GAT ACC CTA ACC TGC GGC TTT 390

GCC GAC CTC ATG GGA TAC ATC CCG CTC GTA GGC GCC CCC 429

GTG GGC GGC GTC GCC AGG GCC CTA GCG CAT GGC GTT AGG 468

GCT CTG GAG GAC GGG ATT AAT TAT GCA ACA GGG AAC CTT 507

CCC GGT TGC TCT TTT TCT ATC TTC CTC TTG GCA CTT CTT 546

TCG TGC CTG ACT GTT CCC GCC TCG GCC 573



(2) INFORMATION FOR SEQ ID NO: 145: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK13 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA CGT 39

AAC ACC AAC CGC CGC CCA ATG GAC GTT AAG TTC CCG GGT 78

GGC GGC CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCT AGA TTG GGT GTG [...] 156

TCG GAG CGG TCG CAA CCT CGT GGG AGG CGC CAG CCT ATC 195

CCC AAG GCG CGC CAA CTC GAG GGT AGG TCC TGG GCT CAG 234

CCT GGG TAT CCT TGG CCC CTT TAC GGC AAT GAG GGC TGC 273

GGG TGG GCG GGA TGG CTC CTG TCA CCC CGT GGC TCT CGG 312

CCG TCT TGG GGC CCG AAT GAT CCC CGG CGG AGG TCC CGC 351

AAC TTG GGT AAG GTC ATC GAT ACC CTA ACT TGC GGC TTC 390

GCC GAC CTC ATG GGA TAC ATC CCG GTC GTA GGC GCC CCC 429

GTG GGT GGC GTC GCC AGA GCC CTG GCG CAT GGC GTC AGG 468

CTT CTG GAG GAC GGG GTC AAT TAT GCA ACA GGG AAT CTT 507

CCC GGT TGC TCT TTC TCT ATC TTC CTC TTG GCA CTG CTC 546

TCG TGC CTG ACT GTT CCC GCT TCG GCC 573

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CGC CGC CCA CAG GAC GTT AAG TTC CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTC TAC TTG TTG CCG CGC 117

AGG GGC CCT AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156

TCA GAA CGG TCG CAA CCC CGT GGG CGG CGC CAG CCT ATT 195

CCC AAG GCG CGC CAA CCC ACG GGC CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AAT TGG GGC CCC AAT GAC CCC CGG CGA AAG TCG CGC 351

AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCC CTT GCA CAT GGT GTG AGG 468

GTT CTT GAG GAC GGG GTA AAC TAT GCA ACG GGG AAT TTG 507

CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTC 546

TCG TGC CTG ACC GTC CCG GCC TCT GCA 573












(2) INFORMATION FOR SEQ ID NO: 147:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SAB

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTT TAG TTG TTG CCG CGC 117

AGG GGC CCT AGA TTG GGT GTG CGC GCG ACT CGG AAG ACT 156

TCA GAA CGG TCG CAA CCC CGT GGG CGG CGC CAG CCT ATT 195

CCC AAG GCG CGC CAA CCC ACG GGC CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AAT TGG GGC CCC AAT GAC CCC CGG CGA AAA TCG CGC 351

AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCC CTC GCA CAT GGT GTG AGG 468

GTT CTT GAG GAC GGG GTA AAC TAT GCA ACA GGG AAT TTG 507

CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTC 546

TCG TGC TTG ACC GTC CCA GCC TCT GCA 573



(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA7 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCT AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156

TCA GAA CGG TCG CAA CCC CGT GGG CGG CGC CAG CCT ATT 195

CCC AAG GCG CGC CAA CCC ACG GGC CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AAT TGG GGC CCC AAT GAC CCC CGG CGA AAG TCG CGC 351

AAT TTG GGT AAG GTC ATC GAC ACC CTA ACA TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG 468

GTT CTT GAG GAC GGG GTA AAT TAC GCA ACA GGG AAT CTG 507

CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTC 546

TCG TGC CTG ACC GTC CCA GCC TCC GCA 573
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20 



25 



30 
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(2) INFORMATION FOR SEQ ID NO: 149: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CTC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156

TCG GAA CGG TCG CAA CCC CGT GGG CGG CGC CAG CCT ATT 195

CCC AAG GCG CGC CAA CCC ACG GGC CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AAT TGG GGC CCC AAT GAC CCC CGG CGG AAG TCG CGC 351

AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG 468

GTT CTT GAG GAC GGG GTA AAC TAC GCA ACA GGG AAT TTG 507

CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTT 546

TCC TGT CTG ATC ATC CCG GCC TCT GCA 573



(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150 



ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78 

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117 

AGG GGC CCC AGG TTG GGT GTG CGC GCG ACT CGG AAG ACT 156 

TCA GAA CGG TCG CAA CCC CGT GGA CGG CGC CAG CCT ATT 195 

CCC AAG GCT CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA 234 

CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273 

GAG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AGT TGG GGC CCC AAC GAC CCC CGG CGG AAA TCG CGC 351

AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390

GCC GAT CTG ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAT GGT GTG AGG 468

GTT CTT GAG GAC GGG GTA AAC TAC GCA ACA GGG AAT TTA 507

CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTT 546

TCA TGC CTG ACC GTC CCG GCC TCT GCA 573



(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:




ATG AGC ACG AAT CCT AAA CCT CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CGC CGC CCA CAG GAC GTC AAG TTC CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCT AGG TTG GGT GTG CGC GCA ACT CGG AAG ACT 156

TCA GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG CCT ATC 195

CCC AAG GCG CGC CAG CCC ACG GGC CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC CTT TAT GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AAT TGG GGC CCC AAT GAC CCC CGG CGG AAA TCG CGC 351

AAC TTG GGT AAG GTC ATC GAT ACC CTG ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG 468

GTC CTT GAG GAC GGG GTA AAC TAT GCA ACA GGG AAT TTA 507

CCC GGT TGC TCT TTC TCT ATC TTT ATC CTT GCA CTT CTT 546

TCA TGC CTG ACT GTC CCG ACC TCT GCC 573



(2) INFORMATION FOR SEQ ID NO: 152:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:



ATG AGC ACG AAT CCT AAA [3 codons illegible] AAA ACC CAA AGA 39

AAC ACC AAC CGC CGC CCA CAG [4 codons illegible] CCG GGC 78

GGT GGT CAG ATC GTT GGT GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCT CGT ATG GGT GTG CGC GCG ACT CGG AAG ACT 156

TCG GAA CGG TCG CAA CCC CGT GGA CGG CGT CAG CCT ATT 195

CCC AAG GCG CGC CAA TCC GCG GGT CGG TCC TGG GGT CAA 234

CCC GGG TAC CCT TGG CCC CTT TAC GCC AAT GAG GGC CTC 273

GGG TGG GCA GGG TGG TTG CTC TCC CCC CGA GGC TCT CGG 312

CCT AAT TGG GGC CCC AAT GAC CCC CGG CGA AAA TCG CGC 351

AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGC GGA TTC 390

GCC GAC CTC ATG GGG TAC ATC CCG CTC GTA GGC GGC CCC 429

GTT GGG GGC GTC GCA AGG GCT CTC GCA CAC GGT GTG AGG 468

GTT CTT GAG GAC GGG GTA AAC TAT GCA ACA GGG AAT TTG 507

CCC GGT TGC TCT TTC TCT ATC TTT GTC CTT GCA CTT CTC 546

TCG TGC CTA ACC GTC CCT GCC TCT GCA 573



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

39 
78 
117 

25 ™r ooA ™ Xtt ™ "ffl AL ' T CGG AAG ACT 156 

195 
234 
273 
312 
351 
390 

30 GTT GGG GGC GTC GCA AGG GCC CTC GCA CAC GGT GTG Atti til 

507 
546 
573 

(2) INFORMATION FOR SEQ ID NO: 154: 
(i) SEQUENCE CHARACTERISTICS:



CCT 


AAA 


CCT 


CAA 


AGA 


AAA 


ACC 


AAA 


AGA 


CGC 


CCA 


CAG 


GAC 


GTC 


AAG 


TTC 


CCG 


GGC 


GTT 


GGT 


GGA 


GTT 


TAC 


TTG 


TTG 


CCG 


CGC 


TTG 


GGT 


GTG 


CGC 


GCG 


ACT 


CGG 


AAG 


ACT 


CAA 


CCC 


CGT 


GGG 


CGG 


CGT 


CAG 


CCT 


ATT 


CAA 


CCC 


ACG 


GGC 


CGG 


TCC 


TGG 


GGT 


CAA 


TGG 


CCC 


TTT 


TAC 


GCC 


AAT 


GAG 


GGC 


CTC 


TGG 


CTG 


CTC 


TCC 


CCT 


CGA 


GGC 


TCT 


CGG 


CCC 


AAT 


GAC 


CCC 


CGG 


CGA 


AGA 


TCG 


CGC 


GTC 


ATC 


GAT 


ACC 


CTA 


ACG 


TGC 


GGA 


TTC 


GGG 


TAC 


ATC 


CCG 


CTC 


GTA 


GGC 


GGC 


CCC 


GCA 


AGG 


GCC 


CTC 


GCA 


CAC 


GGT 


GTG 


AGA 


GGG 


GTA 


AAT 


TAT 


GCA 


ACA 


GGG 


AAT 


CTT 


TTC 


TCC 


ATC 


TTT 


ATC 


CTT 


GCA 


CTT 


CTC 


GTC 


CCG 


GCC 


ACT 


GCA

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: HK2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

ATG AGC ACA CTT CCA AAA CCC CAA AGA AAA ACC AAA AGA 39

AAC ACC AAC CGT CGC CCA ACG GAC GTC AAG TTC CCG GGT 78

GGC GGT CAG ATC GTT GGC GGA GTT TAC TTG TTG CCG CGC 117

AGG GGC CCC CGG TTG GGT GTG CGC GCG ACG AGA AAG ACT 156

TCC GAG CGA TCC CAG CCC AGA GGC AGG CGC CAA CCT ATA 195

CCA AAG GCG CGC CAG CCC CAG GGC AGG CAC TGG GCT CAG 234

CCC GGA TAC CCT TGG CCT CTT TAT GGA AAC GAG GGC TGT 273

GGG TGG GCA GGT TGG CTC CTG TCC CCC CGC GGC TCC CGG 312

CCA CAT TGG GGC CCC AAT GAC CCC CGG CGT CGA TCC CGG 351

AAT TTG GGT AAG GTC ATC GAT ACC CTA ACG TGT GGG TTC 390

GCC GAT CTC ATG GGG TAC ATT CCC GTC GTG GGC GCG CCT 429

TTG GGC GGC GTC GCG GCT GCG CTC GCA CAT GGC GTG AGG 468

GCA ATC GAG GAC GGG ATC AAT TAT GCA ACA GGG AAT CTC 507

CCC GGT TGC TCT TTC TCT ATC TTC CTT TTG GCA CTA CTC 546

TCG TGC CTC ACA ACG CCA GCT TCG GCT 573











(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:


Met 


Ser 


Thr 


Asn Pro Lys 


Pro 


Gin 


Arg 


Lys 


Thr 


Lys 


Arg 


Asn 


1 






5 








10 








Gly 


Thr 


Asn 


Arg 


Arg Pro Gin 


Asp 


Val 


Lys 


Phe 


Pro 


Gly 


Gly 


15 




20 










25 








Gin 


He 


Val 


Gly Gly Val 


Tyr 


Leu 


Leu 


Pro 


Arg 


Arg 


Gly 


Pro 




30 




35 










40 






Arg 


Leu 


Gly 


Val Arg Ala 


Pro 


Arg 


Lys 


Thr 


Ser 


Glu 


Arg 


Ser 




45 






50 










55 




Gin 


Pro 


Arg 


Gly Arg Arg 


Gin 


Pro 


He 


Pro 


Lys 


Ala 


Arg 


Arg 






60 






65 










70 
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5 



Pro 




oly my 


l nr 


Trp 


Ala 


Gin 


Pro 


Gly 


Tyr 


Pro 


Trp 


Pro 








75 










80 








JJC u 


iyr 




vjJLU 


Gly 


Cys 


Gly 


Trp 


Ala 


Gly 


Trp 


Leu 


Leu 


O 3 








90 










95 






C A V* 


riO 


Arg Gly 


Ser 


Arg 


Pro 


Ser 


Trp 


Gly 


Pro 


Thr 


Asp 


Pro 




inn 








105 










110 






^ y 


Arg Ser 


Arg 


Asn 


Leu 


Gly 


Lys 


Val 


lie 


Asp 


Thr 


Leu 


Thr 




115 








120 








125 




Cys 


Gly Phe 


Ala 


Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 


Leu 


Val 


Gly 




130 










135 








140 


Ala 


Pro Leu 


Gly 


Gly Ala 


Ala 


Arg 


Ala 


Leu 


Ala 


His 


Gly 


Val 






145 










150 








Arg 


Val Leu 


Glu 


Asp 


Gly Val 


Asn 


Tyr 


Ala 


Thr 


Gly Asn 


155 








160 








165 








Leu 


Pro 


Gly Cys 


Ser 


Phe 


Ser 


He 


Phe 


Leu 


Leu 


Ala 


Leu 


Leu 




170 








175 










180 




Ser 


Cys 


Leu Thr 


Val 


Pro 


Ala 


Ser 


Ala 













185 190 



(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S14

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:


Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 



Met 
1 


Ser 


i nr 


Asn 


Pro 


Lys 


Pro 


Thr 


Asn 


Arg 


Arg 


5 

Pro 


Gin 


Asp 


15 










20 


Gin 


He 


Val 


Gly 


Gly 


Val 


Tyr 




30 










35 


Arg 


Leu 


Gly 


Val 


Arg 


Ala 


Thr 






45 










Gin 


Pro 


Arg 


Gly Arg 


Arg 


Gin 








60 






Pro 


Glu 


Gly 


Arg 


Thr 


Trp 


Ala 










75 






Leu 


Tyr 


Gly 


Asn 


Glu 


Gly 


Cys 


85 










90 


Ser 


Pro 


Arg 


Gly 


Ser 


Arg 


Pro 




100 








105 


Arg 


Arg 


Arg 


Ser 


Arg 


Asn 


Leu 






115 










Thr 


Cys 


Gly 


Phe 


Ala 


Asp 


Leu 








130 






Gly Ala 


Pro 


Leu 


Gly 


Gly 


Ala 










145 






Val 


Arg 


Val 


Leu 


Glu 


Asp 


Gly 


155 










160 


Leu 


Pro 


Gly 


Cys 


Ser 


Phe 


Ser 




170 










175 


Ser 


Cys 


Leu 


Thr 


Val 


Pro 


Ala 



10 

Val Lys Phe Pro Gly Gly Gly 
25 

Leu Leu Pro Arg Arg Gly Pro 

40 

Arg Lys Thr Ser Glu Arg Ser 

50 55 
Pro lie Pro Lys Ala Arg Arg 
65 70 
Gin Pro Gly Tyr Pro Trp Pro 
80 

Gly Trp Ala Gly Trp Leu Leu 
95 

Ser Trp Gly Pro Thr Asp Pro 

110 

Gly Lys Val lie Asp Thr Leu 
120 125 
Met Gly Tyr lie Pro Leu Val 
135 140 
Ala Arg Ala Leu Ala His Gly 
25 145 150 

17=1 * "" 1 T " L l Asn Tyr Ala Thr Gly Asn 

165 

e Phe Leu Leu Ala Leu Leu 
180 

•■r Ala 

185 190 



30 

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 



5 



20 



Met 


Ser 


i nr 


Asn 


rIO 


T 

juys 




Olil 


Arg 


T A/Q 


Thr 


Lys 


Aro 


Asn 


J. 








c 










10 










Tnr 


Asn 


Arg 


Arg 


Pro 


uin 


Asp 


Vox 




Php 


rxu 


Glv 


Glv 


Glv 


15 
















25 








Gin 


lie 


vai 


vjiy 


c?±y 


vax 


Tyr 


Leu 


Leu 


rro 


Arg 


/*xy 


v*xy 


Pro 




30 


















40 






Arg 


Lteu 


Ls±y 


vai 


Arg 


TV 1 -j 


j. nr 


Arg 


T e 

Jjy s 


Thr 


C ^ y 




**x y 


Ser 




45 




















55 




Gin 


Pro 


Arg 


dxy 
60 


Arg 


Arg 




rro 


Tip 
X XC 

65 




XJ jr O 


Ala 




Aro 

*\x y 

70 


Pro 


Glu 


Gly 


Arg 


Thr 
75 


Trp 


Ala 


Gin 


Pro 


Gly 
80 


Tyr 


Pro 


Trp 


Pro 


Leu 


Tyr 


Gly 


Asn 


Glu 


Gly 


Cys 


Gly 


Trp 


Ala 


Gly 


Trp 


Leu 


Leu 


85 










90 










95 








Ser 


Pro 
100 


Arg 


Gly 


Ser 


Arg 


Pro 
105 


Ser 


Trp 


Gly 


Pro 


Thr 
110 


Asp 


Pro 


Arg 


Arg 


Arg 
115 


Ser 


Arg 


Asn 


Leu 


Gly 
120 


Lys 


Val 


He 


Asp 


Thr 
125 


Leu 


Thr 


Cys 


Gly 


Phe 


Ala 


Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 


Leu 


Val 




130 










135 










14 0 


Gly 


Ala 


Pro 


Leu 


Gly 


Gly 


Ala 


Ala 


Arg 


Ala 


Leu 


Ala 


His 


Gly 








145 










150 










Val 


Arg 


Val 


Leu 


Glu 


Asp 


Gly 


Val 


Asn 


Tyr 


Ala 


Thr 


Gly 


Asn 


155 








160 










165 








Leu 


Pro 


Gly 


Cys 


Ser 


Phe 


Ser 


lie 


Phe 


Leu 


Leu 


Ala 


Leu 


Leu 




170 






175 










180 






Ser 


Cys 


Leu 


Thr 


Val 


Pro 


Ala 


Ser 


Ala

(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

( D ) TOPOLOGY : unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 

Met Ser Thr Asn Pro Lys Pro Gin Arg Lys Thr Lys Arg Asn 

1 5 10 

35 
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5 



15 



1 11X 


A c n 


Arg 


Arg 


Pro Gin 


ASp 


Val 


Lys 


Phe 


Pro 


Gly 


Gly 


Gly 


lb 








20 










25 


oin 


lie 


Val 


Gly 


Gly Val 


Tyr 


Leu 


Leu 


Pro 


Arg 


Arg 


Gly 


Pro 




3 0 








35 








40 




Arcf 


Leu 


Gly 


Val 


Arg Ala 


Tnr 


Arg 


Lys 


Thr 


Ser 


Glu 


Arg 


Ser 






45 








50 










55 




riu 


Arg 


Gly Arg Arg 


Gin 


Pro 


He 


Pro 


Lys 


Ala 


Arg 


Arg 








60 








65 








70 


Pro 


Glu 


Gly Arg 


Thr Trp 
75 


Ala 


Gin 


Pro 


Gly Tyr 


Pro 


Trp 


Pro 


Leu 


Tyr 


Gly 


Asn 


Glu Gly 


Cys 


Gly 


Trp 


Ala 


Gly 


Trp 


Leu 


Leu 


85 








90. 










95 






Ser 


Pro 


Arg 


Gly 


Ser Arg 


Pro 


Ser 


Trp 


Gly 


Pro 


Thr 


Asp 


Pro 




100 








;105 








110 




Arg 


Arg 


Arg 


Ser 


Arg Asn 


Leu 


Gly 


Lys 


Val 


He 


Asp 


Thr 


Leu 


Thr 




115 








120 






125 




Cys 


Gly 


Phe 


Ala Asp 


Leu 


Met Gly 


Tyr 


He 


Pro 


Leu 


Val 








130 








135 








140 


Gly Ala 


Pro 


Leu 


Gly Gly Ala 


Ala 


Arg 


Ala 


Leu 


Ala 


His 


Gly 










145 








150 








Val 


Arg 


Val 


Leu 


Glu Asp 


Gly Val 


Asn 


Tyr 


Ala 


Thr 


Gly 


Asn 


155 








160 








165 






Leu 


Pro 


Gly 


Cys 


Ser Phe 


Ser 


He 


Phe 


Leu 


Leu 


Ala 


Leu 


Leu 




170 








175 










180 






Ser 


Cys 


Leu 


Thr 


Val Pro 


Ala 


Ser 


Ala

(2) INFORMATION FOR SEQ ID NO: 161:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161 



30 



35 



Met 


Ser 


Thr Asn 


Pro 


Lys 


Pro 


Gin 


Arg 


Lys 


Thr 


1 






5 








10 




Thr 


Asn 


Arg Arg 


Pro 


Gin 


Asp 


Val 


Lys 


Phe 


Pro 


15 








20 








25 


Gin 


He 


Val Gly 


Gly 


Val 


Tyr 


Leu 


Leu 


Pro 


Arg 




30 








35 








Arg 


Leu 


Gly Val 


Arg 


Ala 


Thr 


Arg 


Lys 


Thr 


Ser 






45 








50 






Gin 


Pro 


Arg Gly 


Arg 


Arg 


Gin 


Pro 


He 


Pro 


Lys 






60 










65 




Pro 


Glu 


Gly Arg 


Thr 


Trp 


Ala 


Gin 


Pro 


Gly 


Tyr 








75 










80 



40 

31u Arg Ser 
55 

U.a Arg Gin 
70 



SUBSTITUTE SHEET (RULE 26) 
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o 



5 



Leu 


Tyr 


Gly 


Asn 


GlU 


Gly 


Lieu 


Cjxy 


Trp 


ax a 




irp 


Leu 


Leu 


85 






y u 










QC 








Ser 


Pro 


Arg 


Gly 


Ser 


Arg 


Pro 


Ser 


irp 


biy 


Xr X vj 


X Hi. 


Asp 


Pro 




100 
















J. J. w 






Arg 


Arg 


Arg 


Ser 




Asn 


Leu 




Lys 


VdX 


Tip 


erv 


Thr 


Leu 




115 










120 










12 5 




Thr 


Cys 


Gly 


Phe 


Ala 


Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 


Leu 


Val 






130 










135 










140 


Gly 


Ala 


Pro 


Leu 


Gly 


Gly 


Ala 


Ala 


Arg 


Ala 


Leu 


Ala 


His 


Gly 








145 










150 










Val 


Arg 


Val 


Leu 


Glu 


Asp 


Gly 


Val 


Asn 


Tyr 


Ala 


Thr 


Gly Asn 


155 








160 










165 








Leu 


Pro 


Gly 


Cys 


Pro 


Phe 


Ser 


He 


Phe 


Leu 


Leu 


Ala 


Leu 


Leu 




170 






175 










180 






Ser 


Cys 


Leu 


Thr 


He 


Pro 


Ala 


Ser 


Ala 














185 










190 















(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S45

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162:


Met Ser Thr Asn Pro Lys Pro Gln Arg Ala Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly His Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 163: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: Dl

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:


Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 164: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US 6 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164:


Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
185 190















(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:






WO 96/05315 



PCT/US95/10398 




(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: P10 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 166:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10



Tnr 


Asn 


Arg 


Arg 


Pro 


Gin 


Asp 


vai 


Lys 


pne 


Pro 


LaXy 




p i 

\3±y 


15 










20 










25 








Gin 


He 

30 


Val 


Gly 


Gly 


Val 


Tyr 
35 


Leu 


Leu 


Pro 


Arg 


Arg 

4 0 


Gly 


Pro 


Arg 


Leu 


Gly 
45 


Val 


Arg 


Ala 


Thr 


Arg 
50 


Lys 


Thr 


Ser 


Glu 


Arg 

55 


Ser 


Gin 


Pro 


Arg 


Gly Arg 


Arg 


Gin 


Pro 


He 


Pro 


Lys 


Ala 


Arg 


Arg 








60 










65 










70 


Pro 




vjj.y 


Arg 


Ala 
75 


Trp 


Ala 


pi ti 
bin 


Pro 


80 


Tyr 


riO 


irp 


fro 


Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190













(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T10



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

Met Ser Thr 
1 

Thr Asn Arg 
15 

- Gin lie Val 

30 

Arg Leu Gly 
45 



30 Leu 
85 



100 

Arg Arg 
115 



35 



Asn 


Pro 
5 


Lys 


Pro 


Gin 


Arg 


Lys 
10 


Thr 


Lys 


Arg 


Asn 


Arg 


Pro 


Gin 


Asp 


Val 


Lys 


Phe 


Pro 


Gly 


Gly 


Gly 


Gly 




20 










25 


Gly 


Val 


Tyr 


Leu 


Leu 


Pro 


Arg 


Arg 


Gly 


Pro 


Val 






35 










40 




Arg 


Ala 


Thr 


Arg 


Lys 


Thr 


Ser 


Glu 


Arg 


Ser 










50 










55 




Gly Arg 


Arg 


Gin 


Pro 


He 


Pro 


Lys 


Ala 


Arg 


Gin 


60 










65 








70 


Arg 


Ala 


Trp 


Ala 


Gin 


Pro 


Gly 


Tyr 


Pro 


Trp 


Pro 




75 










80 








Asn 


Glu 


Gly 


Met 


Gly 


Trp 


Ala 


Gly 


Trp 


Leu 


Leu 


Gly 




90 










95 






Ser 


Arg 


Pro 


Ser 


Trp 


Gly 


Pro 


Thr 


Asp 


Pro 








105 










110 




Ser 


Arg 


Asn 


Leu 


Gly 


Lys 


Val 


He 


Asp 


Thr 


Leu 


Phe 


Ala 






120 








125 




Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 


Leu 


Val 


130 










135 










140 




Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 169:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190
















(2) INFORMATION FOR SEQ ID NO: 170: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:


Met 
1 


Ser 




Asn 


Pro 

5 


Lys 


Pro 


Gin 


Arq 


Lvs 


Thr 


Lys 


Arg 


Asn 


Thr 


Asn 




Arg 


Pro 


Gin 


Asp 


Val 


' Lys 


10 
Phe 


Pro 


Glv 


Gly 


Gly 


15 










20 










25 


Gin 


He 




Gly 


Gly 


Val 


Tyr 


Leu 


Leu 


Pro 


Arg 


Arg 


Gly 


Pro 




30 










35 








40 




Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly His Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190















(2) INFORMATION FOR SEQ ID NO: 171:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171:



Met. 


C -v 


Thr 


Asn 




T 


IT JL\J 








Thr 


Lys 


Arg 


Asn 


1 








tr 




















1 TIT 


Asn 


Arg 


Arg 


Pro 


bin 


7A en 


V CLX 






* J. w 


Glv 


Gly Gly 


15 








20 










^ D 








Gin 


He 


Val 


Gly 




vax 


Tyr 


Xj€U 


T .01 1 
Lieu 


XT X 




Zi y rr 
avx y 


Gly 


Pro 




30 








3 D 










A n 






Arg 


Leu 


Gly Val 


Arg 


Til o 

ax a 


i nr 


Arg 


jjys 


i. XiX 


Car 

OCX 


ul U 


Arg 


Ser 




45 










50 










55 




Gin 


Pro 


Arg 


Gly 


Arg 


Arg 


Gin 


Pro 


lie 


Pro 


Lys 


Ax a 


Arg 


His 








60 










6 1> 










70 


Pro 


Glu 


Gly 


Arg 


TV 1 -» 

Ala 


Trp 


Ala 


Gin 


Pro 


Vjxy 


ryr 


XT I U 


Trp 


Pro 








75 




















Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190















(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172: 

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25



\J -1. J, 1 


Tip 

M J. C 


Val 


Gly 


Gly Val 


Tyr 


Leu 


Leu 


Pro 


Arg 


Arg 


Gly 


Pro 




30 










35 








40 






T i^i i 


Gly Val 


Arg 


Ala 


Thr 


Arg 


Lys 


TV* %~ 

i nr 


Ser 


Ci-LU 


Arg 


Ser 






45 




















DO 




Gin 


Pro 


Arg 


Gly Arg 


Arg 


Gin 




lie 


Pro 


Lys 


ax a 


Arg 


Qiln 








60 










65 








70 


Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Thr Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK5 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80






Leu 


Tyr 


OX jr 




Glu 


Glv 


Met 


Glv 


TrD 


Ala 


Gly 


Trp 


Leu 


Leu 


















95 








Cay* 


.TX \J 


His 


Gly 


Ser 


Arg 


Pro 


Ser 


Trp 


Gly 


Pro 


Thr 


Asp 


Pro 




i no 








105 










110 








Arg 


Axy 


C a >- 
X 




Asn 


Leu 


Glv 


Lvs 


Val 


He 


Asp 


Thr 


Leu 


X X 3 










120 










125 


Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Ile Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Thr Pro Val Ser Ala
        185                 190















(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: HK4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Pro Glu Gly Arg Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Val Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: P8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

Met Ser Thr Thr Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Ser Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly His Pro Trp Pro
                75                  80

Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Gly Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Val Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190















(2) INFORMATION FOR SEQ ID NO: 176:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T3 







(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
            60                  65                  70

Pro Glu Gly Arg Ala Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asp Glu Gly Met Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Leu Thr Ile Pro Ala Ser Ala
        185                 190















(2) INFORMATION FOR SEQ ID NO: 177:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:



Met 


|J w X 


1 I1X 


Asn 


Pro 




rlU 


vjin 


Arg 


Lys 


l nr 


Lys 


Arg 


Asn 


1 








5 










x u 










1 11X 


A o n 
noil 




Arg 


Pro 


bin 


Asp 


vai 


Lys 


Fne 


Pro 


Gly 


Gly Gly 












^ KJ 










^ b 








m t-i 


lie 


Val 


Gly 


Gly 


vai 


Tyr 


Leu 


Leu 


Pro 


A 

Arg 


Arg 


Gly 


Pro 














_ — 










4 0 




A *m 


JjcU 


vj±y 


Val 


Arg 


ax a 


x nr 


Arg 


Lys 


Tnr 


Ser 


Glu 


Arg 
55 


Ser 




Pro 


Arg 


Gly Arg 


Arg 


Gin 


Pro 


He 


Pro 


Lys 


Asp 


Arg 


Arg 








60 










65 








70 


Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
    100                 105                 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Ser Leu Ala Asp Leu Met Gly Tyr Val Pro Val Val
            130                 135                 140

Gly Gly Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Ile Thr Ile Pro Val Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 178: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25



Gin 


lie 


Veil 


Gly 


Gly Val 


ryr 


T .D1 1 

JjcU 


T .01 1 
XJtZU 


XT X 


^ x y 


At*ct 


Glv 


Pro 




30 








35 










a n 
^ u 






Arg 


Leu 


vaxy 


Val 


Arg 


Ala 


x rii 


niy 


Jjyo 


Thr 


Cpr 
OCX 


V7X u 




Ser 




4 5 










c: n 
D U 














(jyin 


Pro 


Arg 


Gly Arg 


Arg 


m n 


Pro 


Tip 
lie 




uy 0 




Arg 


Arg 






60 




















/ u 


PrO 




vjiy 


Lys 


Ser 


Trp 


Lj-L y 


T ■» ^ 

LyS 


Pro 






a X w 




Pro 










75 










oU 










Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
    100                 105                 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Ile Thr Ile Pro Val Ser Ala
        185                 190













(2) INFORMATION FOR SEQ ID NO: 179: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Ile Arg Asn
1               5                   10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40

Arg Leu Gly Val Arg Thr Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
            60                  65                  70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
                75                  80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Ser Asp Pro
    100                 105                 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu
        115                 120                 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
            130                 135                 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150

Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180

Ser Cys Ile Thr Thr Pro Ala Ser Ala
        185                 190

(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

Met Ser Thr Ile Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Val Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Leu Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150


Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Ile Thr Ile Pro Val Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Thr Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Leu Gly Arg Val Ile Asp Thr Ile
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Phe Thr Val Pro Val Ser Ala
185 190

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(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 




Met 
1 


Ser 


Thr


Asn 


Pro 

5 


Lys 


Pro 


Gln


Arg 


Lys 


Thr 


Lys 


Arg 


Asn 


Thr 


Asn 




Arg 


Pro 




Asp 


Val


Lys 


10 
Phe


Pro 


Gly 


Gly 


Gly 


15 










20 










25 


Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Thr Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Ile
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Ala Thr Val Pro Val Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 183: 

30 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens
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10 



(C) INDIVIDUAL ISOLATE: DK11







(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Thr Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Pro Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser His Pro Asn Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Ile
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Cys Thr Val Pro Val Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser His Pro Asn Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Ile
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Phe Thr Val Pro Val Ser Ala
185 190



15 



(2) INFORMATION FOR SEQ ID NO: 185: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Ser Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Ser Thr Gly Lys Ser Trp Gly Lys Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Thr Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Ile
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Cys Thr Val Pro Val Ser Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 186: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S83 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Asp Arg Arg
60 65 70

Thr Thr Gly Lys Ser Trp Gly Arg Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Thr Asp Pro
100 105 110

Arg His Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Val Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Ile Ser Val Pro Val Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK10 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Ile Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Val Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
170 175 180

Ser Cys Leu Ile His Pro Ala Ala Ser
185 190

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(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S52 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188:

Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Ile Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Val Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
170 175 180

Ser Cys Leu Val His Pro Ala Ala Ser
185 190















(2) INFORMATION FOR SEQ ID NO: 189: 

30 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens
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o 

(C) INDIVIDUAL ISOLATE: S2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189:

Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Ile Arg Arg Pro Gln Asp Ile Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Val Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Leu Glu Asp Gly Ile Asn Phe Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
170 175 180

Ser Cys Leu Ile His Pro Ala Ala Ser
185 190















(2) INFORMATION FOR SEQ ID NO: 190: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK12

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Ile Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15 20 25
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10 



20 



25 



30 



Gin 


lie 


Val 


Gly 


oi y 


vox 


l yx 


Val 


Leu 


Pro 


Axq 


Arg 


Gly 


Pro 




30 
















40 




Ser 


Arg 


Leu 


Gly 


Val 




Ala 


Thr 


Ara 


Lvs 


Thr 


Ser 


Glu 


Arg 




45 




















55 




Gin 


Pro 


Arg 


Gly 


nig 


A rrr 


Gin 


pro 


He 


Pro 


Lvs 


Ala 


Arg 


Arg 






60 




















70 


Ser 


blU 


Gly 


Arg 


Co t" 
OCX 




Ala 


Gin 


Pro 


Gly 


Tvr 


Pro 


Trp 


Pro 






7b 




















Leu 


Tyr 


Gly Asn 


ulU 




v».y o 


Glv 


Trn 


Ala 


Glv 


Trp 


Leu 


Leu 


85 


























Ser 


Pro 


Arg 


Gly 


Ser 


Axy 


ir X 


C A >- 

X 




Glv 


Pro 


Asn 


ASD 


Pro 




100 
















110 








A Y*CI 


Arg 


Ser 


Ara 


Asn 


Leu 


Gly 


Lys 


Val 


lie 


Asp 


Thr 


Leu 


115 










120 










125 


Val 


Thr 


Cys 


Gly 


Phe 


Ala 


Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 


Leu 




130 










135 










140 


Gly 


Ala 


Pro 


Val 


Gly 


Gly 


Val 


Ala 


Arg 


Ala 


Leu 


Ala 


His 


Gly 








14 5 








150 










Val 


Arg 


Ala 


Leu 


Glu 


Asp 


Gly 


He 


Asn 


Phe 


Ala 


Thr 


Gly 


Asn 


155 








160 










165 






Phe 


Leu 


Pro 


Gly 


Cys 


Ser 


Phe 


Ser 


He 


Phe 


Leu 


Leu 


Ala 


Leu 




170 






175 










180 






Ser 


Cys 


Leu 


He 


His 


Pro 


Ala 


Ala 


Ser 













15 185 



(2) INFORMATION FOR SEQ ID NO: 191:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z4 



35 







(xi) 




SEQUENCE 


1 DESCRIPTION: SEQ ID NO: 191: 


Met 


Ser 


Thr 


Asn 


Pro 


Lys 


Pro 


Gin 


Arg 


Lys 


Thr 


Lys 


Arg 


Asn 


1 








5 








10 










Thr 


Asn 


Arg 


Arg 


Pro 


Met 


Asp 


Val 


Lys 


Phe 


Pro 


Gly 


Gly Gly 


15 






20 










25 




Gly 


Pro 


Gin 


He 


Val 


Gly 


Gly 


Val 


Tyr 


Leu 


Leu 


Pro 


Arg 


Arg 




30 






35 










40 






Arg 


Leu 


Gly 


Val 


Arg 


Ala 


Thr 


Arg 


Lys 


Thr 


Ser 


Glu 


Arg 


Ser 




45 










50 










55 


Gin 


Gin 


Pro 


Arg 


Gly 


Arg 


Arg 


Gin 


Pro 


He 


Pro 


Lys 


Ala 


Arg 






60 










65 










70 


Pro 


Glu 


Gly 


Arg 


Ser 


Trp 


Ala 


Gin 


Pro 


Gly 


Tyr 


Pro 


Trp 


Pro 






75 










80 










Leu 


Tyr Gly 


Asn 


Glu 


Gly 


Cys 


Gly 


Trp 


Ala 


Gly 


Trp 


Leu 


Leu 


85 










90 










95 
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10 



15 



20 



25 



30 



Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Ile Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Val Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
185 190

(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192:




Met 

1 


Ser 


Thr 


Asn 


Thr 


Asn 


Arg 


Arg 


15 








Gin 


He 


Val 


Gly 




30 






Arg 


Leu 


Gly 


Val 






45 




Gin 


Pro 


Arg 


Gly 








60 


Ser 


Glu 


Gly 


Arg 


Leu 


Tyr 


Gly 


Asn 


85 








Ser 


Pro 


Arg 


Gly 




100 






Arg 


Arg 


Arg 


Ser 






115 




Thr 


Cys 


Gly 


Phe 








130 


Gly 


Ala 


Pro 


Val 



75 



Lys 


Pro 


Gin 


Arg 


Lys 
10 


Thr 


Lys 


Arg 


Asn 


Met 


Asp 


Val 


Lys 


Phe 


Pro 


Gly 


Gly 


Gly 


20 










25 






Val 


Tyr 


Leu 


Leu 


Pro 


Arg 


Arg 


Gly 


Pro 




35 










40 




Ala 


Thr 


Arg 


Lys 


Thr 


Ser 


Glu 


Arg 


Ser 






50 










55 




Arg 


Gin 


Pro 


He 


Pro 


Lys 


Ala 


Arg 


Arg 








65 










70 


Trp 


Ala 


Gin 


Pro 


Gly 
80 


Tyr 


Pro 


Trp 


Pro 


Gly 


Cys 


Gly 


Trp 


Ala 


Gly 


Trp 


Leu 


Leu 


. 90 










95 






Arg 


Pro 


Ser 


Trp 


Gly 


Pro 


Asn 


Asp 


Pro 




105 










110 




Asn 


Leu 


Gly 


Lys 


Val 


He 


Asp 


Thr 


Leu 






120 








125 




Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 


Leu 


Val 








135 










140 


Gly 


Val 


Ala 


Arg 


Ala 


Leu 


Ala 


His 


Gly 



35 



145 . . 150 
Val Arg Ala Val Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Ala Ser Ala
185 190



(2) INFORMATION FOR SEQ ID NO: 193:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

10 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: 21 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Ala Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Val Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Thr Pro Ala Ser Ala
185 190













(2) INFORMATION FOR SEQ ID NO: 194: 

35 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z5 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1 5 10

Thr Asn Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly
15 20 25

Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
30 35 40

Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
45 50 55

Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Gln Ala Arg Arg
60 65 70

Ser Glu Gly Arg Ser Trp Ala Gln Pro Gly Tyr Pro Trp Pro
75 80

Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85 90 95

Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Gln Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Leu Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Phe
170 175 180

Ser Cys Leu Thr Thr Pro Ala Ser Ala
185 190

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(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z6






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195: 



5 



15 



Mo t- 


Coy* 


Thr 


Asn 




T ,vc 

XJ jr O 


nu 


m n 




T A/C! 
XJ jr O 


Thr 

X 11X 


T iVQ 


Arg 


Asn 


1 








c 










1 u 










X Xi X 


li en 
noil 


Arg 


Arg 


XT X \J 


Mpt- 




Val 


XJ Jf 0 




i X »— ' 


Glv 


Gly Gly 


15 










Z U 


















bin 


Tl a 

i le 


Val 


Gly 


Vjriy 


Val 


ryr 


JjcU 


T .01 1 


Jtr xO 


Arg 


Arg 


Gly 


Pro 




30 










3d 










^ u 






Arg 


Leu 


Gly 


Val 


Arg 


ai a 


1 nr 


Arg 


Lys 


1 nr 


Gov* 


ul u. 


Arg 


Ser 




45 










50 










55 




Gin 


Pro 


Arg 


Gly 


Arg 


Arg 


-1-1 

c?in 


Pro 


T 1 0 

lie 


Pro 


Lys 


TV 1 -j 

Aia 


Arg 


Arg 








60 










65 










70 


Ser 


VjIU 


Gly Arg 


Ser 


Trp 


a j. a 


oin 


Pro 




Tyr 


riO 


Trp 


Pro 










75 




















XJC Li 


A 7 r 


Gly Asn 




wi y 




uiy 


Trn 


Ala 


Glv 




Leu 


Leu 


85 










90 










95 








Ser Pro Arg Gly Ser Arg Pro Ser Trp Gly Pro Asn Asp Pro
100 105 110

Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
115 120 125

Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
130 135 140

Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
145 150

Val Arg Ala Val Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155 160 165

Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
170 175 180

Ser Cys Leu Thr Val Pro Thr Ser Ala
185 190

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(2) INFORMATION FOR SEQ ID NO: 196: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 






(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10
Thr Asn Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25
Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40
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10 



15 



65 



155 



Leu 


Gly Val 
45 


Arg 


Thr 


Thr 


Arc? 
50 


Lvs 


Thr 


Ser 


Glu 


55 


Pro 


Arg Gly Arg Arg 


Gin 


Pro 


He 


Pro 


Lys 


Ala 




Glu 


60 










65 








70 


Gly Arg 






Ala 


Gin 


Pro 


Glv 


Tvr 


Pro 


Tit"* Prn 






/ j 










80 






Tyr 


Gl \7 A Q TI 


Vj-L Li 


\j J, y 


Cys 


Gly 




Ala 


Glv 




T ,D1 1 T 1 


Pro 


Arg Gly 




90 










95 




Ser 


Arg 


Pro 


Ser 


Tro 


Glv 


Pro 


Asn 




100 








105 










110 


Arg 


Arg Ser 


Arg 


Asn 


Leu 


Gly 


Lys 


Val 


He 


Asp 


Thr Leu 




115 








120 








125 


Cys 


Gly Phe 


Ala 


Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 


Leu Val 


Ala 


130 










135 








140 


Pro Val 


Gly 


Gly 


Val 


Ala 


Arg 


Ala 


Leu 


Ala 


His Gly 




Ala Leu 


145 










150 






Arg 


Glu 


Asp 


Gly 


He 


Asn 


Tyr 


Ala 


Thr 


Gly Asn 




Gly Cys 




160 










165 




Pro 


Ser 


Phe 


Ser 


He 


Phe 


Leu 


Leu 


Ala 


Leu Leu 


170 








175 










180 




Cys 


Leu Thr 


Val 


Pro 


Ala 


Ser 


Ala 
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(2) INFORMATION FOR SEQ ID NO: 197: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10
Thr Asn Arg Arg Pro Met Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25
Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40
Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55
Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70

Ser


Trp 


Ala 


Gin 


Pro 


Gly 


Tyr 


Pro 


Trp 


Pro 


75 










80 








Glu 


Gly 


Cys 


Gly 


Trp 


Ala 


Gly 


Trp 


Leu 


Leu 




90 










95 
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=*!SDOCID: <WO 960531 5A3JA> 






Ser 


Pro 


Arg 


Gly 


Ser 


Arg 


Pro 


Ser 


irp 




* X. 




Asp 


Pro 




100 






105 










110 






Arg 


Arg 


Arg 


C a 


Arg 


Asn 


Leu 


Gly 


XJjr 0 


Val 


He 


Asp 


Thr 


Leu 


lib 










120 










125 




Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
            130                 135                 140
Gly Ala Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150
Val Arg Leu Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165
Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180
Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190


(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 


(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10
Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25
Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40
Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55
Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70
Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80
Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95
Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110
Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140
Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150
Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165
Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180
Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 199:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10
Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25
Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40
Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55
Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70
Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80
Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95
Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110
Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140
Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150
Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165
Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180
Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190

(2) INFORMATION FOR SEQ ID NO: 200: 
(i) SEQUENCE CHARACTERISTICS:
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(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 



Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10
Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25
Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40
Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55
Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70
Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80
Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95
Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110
Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140
Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150
Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165
Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180
Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1
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10 



15 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 



1 

?hr 
15 



85 



Ser 


Thr 


Asn 


Pro 
5 


T iVc 


Dm 


bin 


Arg 


Lys 


rpl_ 

i nr 


Lys 


Arg 


Asn 


Asn 


Leu 


Arg 


Pro 


Gin 




VclX 


Lys 


10 

rfle 


Fro 


Gly 


Gly 


Gly 


lie 








20 














Val 


Glv 


Glv 


Val 


iyr 




lieu 


Pro 


Arg 


Arg 


Gly 


Pro 


30 


Glv 


















4 U 




Leu 


Val 


Arrr 




i nr 


Arg 


Lys 


inr 


Ser 


GlU 


Arg 


Ser 




45 
























Pro 


Arg 




ax y 


Arg 


bin 


Pro 


He 


Pro 


Lys 


Ala 


Arg 


Gin 


Thr 


Gly 


60 


















70 


Arg 




Trp 


Gly 


Gin 


Pro 


Gly 


Tyr 


Pro 


Trp 


Pro 


fur 

iyr 


A J. <a. 


Asn 




(jiy 


Leu 


Gly 


Trp 


80 
Ala 


Gly 


Trp 


Leu 


Leu 






j. y 




90 










95 






n u 


Arg 


Ser 


Arg 


Pro 


Asn 


Trp 


Gly 


Pro 


Asn 


Asp 


Pro 


100 










105 








110 




Arg 


Lys 


Ser 


Arg 


Asn 


Leu 


Gly 


Lys 


Val 


He 


Asp 


Thr 


Leu 




115 










120 








125 




Cys 


Gly 


Phe 


Ala 


Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 


Leu 


Val 


Gly 




130 










135 








140 


Pro 


Val 


Gly 


Gly 


Val 


Ala 


Arg 


Ala 


Leu 


Ala 


His 


Gly 




Val 




145 










150 








Arg 


Leu 


Glu 


Asp 


Gly 


Val 


Asn 


Tyr 


Ala 


Thr 


Gly 


Asn 


Pro 


Gly 






160 










165 






Cys 


Ser 


Phe 


Ser 


He 


Phe 


He 


Leu 


Ala 


Leu 


Leu 


170 










175 










180 




Cys 


Leu 


He 


He 


Pro 


Ala 


Ser 


Ala 












185 










190 















155 



20 

(2) INFORMATION FOR SEQ ID NO: 202: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10
Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25
Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40
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10 



15 



20 



25 



30 



Arg 


Leu 


Gly Val 


Aro 


Ala 


Thr 


Arg 


Lys 


Thr 


Ser 


Glu 


Arg 


Ser 




45 










50 










55 


Gin 




Pro 


Arg 


Gly 




Ara 


Gin 


Pro 


lie 


Pro 


Lys 


Ala 


Arg 






OO 










65 










70 


Pro 


Thr 


Gly 


Arg 


Ser 


Trp 


Gly 


Gin 


Pro 


Gly 
80 


Tyr 


Pro 


Trp 


Pro 


T ,ei l 

±J\Z LI 


Tyr 


Ala 


Asn 


Glu 


Gly 


Leu 


Glu 


Trp 


Ala 


Gly 


Trp 


Leu 


Leu 


o o 








90 










95 










Pro 


Arg 


Gly 


Ser 


Ara 


Pro 


Ser 


Trp 


Gly 


Pro 


Asn 


Asp 


Pro 




TOO 

X V V 






105 










110 






Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140
Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150
Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165
Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180
Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA13 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10
Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25
Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40
Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55
Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70
Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80
Leu Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95
Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110
Arg Arg Lys Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140
Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150
Val Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165
Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180
Ser Cys Leu Thr Val Pro Thr Ser Ala
        185                 190

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10 (2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204: 

Met Ser 

20 1 

Thr Asn 

15 
Gin He 

Gly Val Arg Ala Thr Arg Lys Thr Ser Glu / 

55 

C? I n Prn fi. rrr m ir A >-^-c 7\ ■. — . /~o _ t-> -r -i _ - 

25 

70 



35 155 



Thr 


Asn 


Pro 


Lys 


Pro 


Gin 


Arg 


Lys 


Thr 


Gin 






5 










10 






Arg 


Arg 


Pro 


Gin 


Asp 


Val 


Lys 


Phe 


Pro 


Gly 








20 










25 


Val 


Gly 


Gly 


Val 


Tyr 


Leu 


Leu 


Pro 


Arg 


Arg 










35 








40 


Gly Val 


Arg 


Ala 


Thr 


Arg 


Lys 


Thr 


Ser 


Glu 


45 










50 








Arg 


Gly 


Arg 


Arg 


Gin 


Pro 


He 


Pro 


Lys 


Ala 




60 










65 






Gly Arg 


Ser 


Trp 


Gly 


Gin 


Pro 


Gly 


Tyr 


Pro 


Ala 




75 










80 




Asn 


Glu 


Gly 


Leu 


Gly 


Trp 


Ala 


Gly 


Trp 








90 










95 


Arg 


Gly 


Ser 


Arg 


Pro 


Asn 


Trp 


Gly 


Pro 


Asn 










105 










110 


Lys 


Ser 


Arg 


Asn 


Leu 


Gly Lys 


Val 


He 


Asp 


115 










120 








Gly 


Phe 


Ala 


Asp 


Leu 


Met 


Gly 


Tyr 


He 


Pro 




130 










135 






Pro Val 


Gly 


Gly 


Val 


Ala 


Arg 


Ala 


Leu 


Ala 






145 








150 






Val 


Leu 


Glu 


Asp 


Gly 


Val 


Asn 


Tyr 


Ala 


Thr 



85 
Ser Pro 
100 

30 Ar 9 ^9 

125 



140 



165
Leu Pro Gly Cys Ser Phe Ser Ile Phe Val Leu Ala Leu Leu
    170                 175                 180
Ser Cys Leu Thr Val Pro Ala Ser Ala
        185                 190



(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA11 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:

Met Ser Thr Asn Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10
Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25
Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40
Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55
Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70
Pro Thr Gly Arg Ser Trp Gly Gln Pro Gly Tyr Pro Trp Pro
                75                  80
Phe Tyr Ala Asn Glu Gly Leu Gly Trp Ala Gly Trp Leu Leu
85                  90                  95
Ser Pro Arg Gly Ser Arg Pro Asn Trp Gly Pro Asn Asp Pro
    100                 105                 110
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu Val
            130                 135                 140
Gly Gly Pro Val Gly Gly Val Ala Arg Ala Leu Ala His Gly
                145                 150
Val Arg Ala Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn
155                 160                 165
Leu Pro Gly Cys Ser Phe Ser Ile Phe Ile Leu Ala Leu Leu
    170                 175                 180
Ser Cys Leu Thr Val Pro Ala Thr Ala
        185                 190















(2) INFORMATION FOR SEQ ID NO: 206: 
(i) SEQUENCE CHARACTERISTICS:



(A) LENGTH: 191 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: HK2 











(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:

Met Ser Thr Leu Pro Lys Pro Gln Arg Lys Thr Lys Arg Asn
1               5                   10
Thr Asn Arg Arg Pro Thr Asp Val Lys Phe Pro Gly Gly Gly
15                  20                  25
Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro
    30                  35                  40
Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser
        45                  50                  55
Gln Pro Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Gln
            60                  65                  70
Pro Gln Gly Arg His Trp Ala Gln Pro Gly Tyr Pro Trp Pro
                75                  80
Leu Tyr Gly Asn Glu Gly Cys Gly Trp Ala Gly Trp Leu Leu
85                  90                  95
Ser Pro Arg Gly Ser Arg Pro His Trp Gly Pro Asn Asp Pro
    100                 105                 110
Arg Arg Arg Ser Arg Asn Leu Gly Lys Val Ile Asp Thr Leu
        115                 120                 125
Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Val Val
            130                 135                 140
Gly Ala Pro Leu Gly Gly Val Ala Ala Ala Leu Ala His Gly
                145                 150
Val Arg Ala Ile Glu Asp Gly Ile Asn Tyr Ala Thr Gly Asn
155                 160                 165
Leu Pro Gly Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu
    170                 175                 180
Ser Cys Leu Thr Thr Pro Ala Ser Ala
        185                 190















(2) INFORMATION FOR SEQ ID NO: 207:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207:

GCGTCCGGGT TCTGGAAGAC GGCGTGAACT ATGCAACAGG

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(2) INFORMATION FOR SEQ ID NO:208: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208:

AGGCTTTCAT TGCAGTTCAA GGCCGTGCTA TTGATGTGCC

(2) INFORMATION FOR SEQ ID NO: 209:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

AAGACGGCGT GAACTATGCA ACAGGGAACC TTCCTGGTTG

(2) INFORMATION FOR SEQ ID NO: 210:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 210:

AGTTCAAGGC CGTGCTATTG ATGTGCCAAC TGCCGTTGGT

(2) INFORMATION FOR SEQ ID NO: 211:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 211:

AAGACGGCGT GAATTCTGCA ACAGGGAACC TTCCTGGTTG 40

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(2) INFORMATION FOR SEQ ID NO:212: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 212:

AGTTCAAGGC CGTGGAATTC ATGTGCCAAC TGCCGTTGGT

(2) INFORMATION FOR SEQ ID NO: 213:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 213:

ARCTYCGACG TYACATCGAY CTGCTYGTYG GRAGYGCCAC CC

(2) INFORMATION FOR SEQ ID NO: 214:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 214:

RCARGCCRTC TTGGAYATGA TCGCTGGWGC Y

(2) INFORMATION FOR SEQ ID NO: 215:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 215:

CRATACGACR YCAYGTCGAY TTGCTCGTTG GGGCGGCTRY YT

(2) INFORMATION FOR SEQ ID NO: 216:

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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 216:

RCAAGCTRTC RTGGAYRTGG TRRCRGGRGC C 31

(2) INFORMATION FOR SEQ ID NO: 217:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 217:

TTGCGGACKC ACATYGACAT GGTYGTGATG TCCGCCACGC 40

(2) INFORMATION FOR SEQ ID NO: 218:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 218:

GATGCGCGTT CCCGAGGTCA TCWTAGACAT CRTYRGCGGR GCD 43

(2) INFORMATION FOR SEQ ID NO: 219:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 219:

AATGGCACCY TGCRCTGCTG GATACAAGTR ACACCTAATG TGGCTGTGAA 50
ACAC 54

(2) INFORMATION FOR SEQ ID NO: 220:

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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 220:

TGARCTAGYC CTYSARGTYG TCTTCGGYGG Y

(2) INFORMATION FOR SEQ ID NO: 221:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 221:

GCCAACGTCT CTCGATGTTG GGTGCCGGTT GCCCCCAATC TCGCCATAAG
TCAA

(2) INFORMATION FOR SEQ ID NO: 222:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 222:

AAGGGCCTGC GAGCACACAT CGATATCATC GTGATGTCTG CTACGG

(2) INFORMATION FOR SEQ ID NO: 223:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 223:

TTGGTGCGCA TCCCGGAAGT CATCTTGGAT ATTGTTACAG GAGGT

(2) INFORMATION FOR SEQ ID NO: 224:

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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 224:

AGTCAGGTAY GTCGGAGCAA CCACCGCYTC GATACGCAGT

(2) INFORMATION FOR SEQ ID NO: 225:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 225:

AGCCTTCACG TTCAGACCKC GTCGCCATCA AACRGTCCAG ACCTGT

(2) INFORMATION FOR SEQ ID NO: 226:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 75 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 226:

TCCCCCGCYG TGGGTATGGT GGTRGCGCAC RTYCTGCGDY TGCCCCAGAC
CKTGTTYGAC ATAMTRGCYG GGGCC

(2) INFORMATION FOR SEQ ID NO: 227:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 227:

ACGCCGGTGA CGCCTACAGT GGCTGTCGCA CACCCGGGC

(2) INFORMATION FOR SEQ ID NO: 228:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 228:

ATGAGGGTCC CCACAGCCTT TCTCGACATG GTTGCCGGAG GC 42

(2) INFORMATION FOR SEQ ID NO: 229:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 229:

CGCGCCCTAT CCCAACGCAC CGTTAGAGTC CATGCGCAGG 40

(2) INFORMATION FOR SEQ ID NO: 230:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 49 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 230:

TCAGATCTTA CGGATCCCCT CTATCCTAGG TGACTTGCTC ACCGGGGGT 49

(2) INFORMATION FOR SEQ ID NO: 231:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 231:

CAGTCACGCT GCTGGGTGGC CCTTACTCCC ACCGTGGCGG YGYCTTATAT 50
CGGT 54

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(2) INFORMATION FOR SEQ ID NO:232: 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 232:

TAGCACTCTG GTRGAYCTAC TCRCTGGAGG G

(2) INFORMATION FOR SEQ ID NO: 233:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 233:

AAGTCTACAT GCTGGGTGTC TCTCACCCCC ACCGTGGCTG CGCAACATCT 50
GAAT 54

(2) INFORMATION FOR SEQ ID NO: 234:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 234:

AGGCGCCATG GTCGACCTGC TTGCAGGCGG C 31

(2) INFORMATION FOR SEQ ID NO: 235:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 235:

TCAGCCCCGA VYYTCGGAGC GGTCACGGCT CCTCTTCGGA GGG 43

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(2) INFORMATION FOR SEQ ID NO : 236 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

5 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 236: 

TGYTACGGAT YCCCCARGTG GTCATHGACA TCATWGCCGG GGSC 44 

(2) INFORMATION FOR SEQ ID NO: 237: 

!0 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 0 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23 7: 

15 CATACCAAAT GCTTCCACGC CCGCAACGGG ATTCCGCAGG 4 0 

(2) INFORMATION FOR SEQ ID NO: 23 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 
20 (B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 238: 

TCTTCTTGCG GGCGCCGCAG TGGTTTGCTC ATCCCTG 3 7 

25 (2) INFORMATION FOR SEQ ID NO: 23 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:239: 

ATCTAGCATC TTGAGGGTAC CTGAGATTTG TGCGAGTGTG ATATTTGGTG 50 
GC 52

(2) INFORMATION FOR SEQ ID NO: 240:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:240:

Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala

5 10 15

Leu Thr His Asn Leu Arg Xaa His Xaa Asp Xaa Ile Val Met Ala

20 25 30

Ala Thr Val
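
The peptide entries are given in three-letter code with position markers every five residues. As a hedged illustration only, the Python sketch below converts SEQ ID NO:240 to one-letter code and checks the result against the stated length of 33 amino acids. The helper name `to_one_letter` is an assumption made for this example; `Xaa` (unspecified residue) is mapped to `X`.

```python
# Standard three-letter to one-letter amino acid codes; Xaa marks an
# unspecified residue and is written here as X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}

# SEQ ID NO:240 as given in the listing (position markers omitted).
SEQ_240 = """
Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala
Leu Thr His Asn Leu Arg Xaa His Xaa Asp Xaa Ile Val Met Ala
Ala Thr Val
"""


def to_one_letter(listing: str) -> str:
    """Convert whitespace-separated three-letter residues to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in listing.split())


if __name__ == "__main__":
    peptide = to_one_letter(SEQ_240)
    print(peptide, len(peptide))
```

A length mismatch against the (A) LENGTH field is a quick way to spot a dropped or duplicated residue in a transcribed entry.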

(2) INFORMATION FOR SEQ ID NO: 241: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 241: 



Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro Gly Ala

5 10 15

Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val Met Ser

20 25 30

Ala Thr Val

(2) INFORMATION FOR SEQ ID NO:242: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:242:

Trp Ile Pro Val Xaa Pro Asn Val Ala Val Xaa Xaa Pro Gly Ala

5 10 15

Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser

20 25 30

Ala Thr Leu

(2) INFORMATION FOR SEQ ID NO: 243: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:243: 

Trp Thr Xaa Val Thr Pro Thr Val Ala Val Arg Tyr Val Gly Ala 

5 10 15 

Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly Ala

20 25 30 

Ala Thr Xaa 



(2) INFORMATION FOR SEQ ID NO:244: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 244:

Trp Val Ala Leu Xaa Pro Thr Leu Ala Ala Arg Asn Xaa Xaa Xaa

5 10 15

Xaa Thr Xaa Xaa Ile Arg Xaa His Val Asp Leu Leu Val Gly Ala

20 25 30

Ala Xaa Phe

(2) INFORMATION FOR SEQ ID NO:245: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 245:

Trp Val Xaa Xaa Xaa Pro Thr Val Ala Thr Arg Asp Gly Lys Leu

5 10 15

Pro Xaa Xaa Gln Leu Arg Arg Xaa Ile Asp Leu Leu Val Gly Ser

20 25 30

Ala Thr Leu

(2) INFORMATION FOR SEQ ID NO:246: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:246:

Trp Thr Pro Val Thr Pro Thr Val Ala Val Ala His Pro Gly Ala 

5 10 15 

Pro Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala 
20 25 30

Ala Thr Leu

(2) INFORMATION FOR SEQ ID NO:247: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 247:

Trp Val Ala Leu Thr Pro Thr Val Ala Xaa Xaa Tyr Ile Gly Ala

5 10 15

Pro Leu Xaa Ser Xaa Arg Arg His Val Asp Leu Met Val Gly Ala

20 25 30

Ala Thr Val 



(2) INFORMATION FOR SEQ ID NO: 248:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:248:

Trp Val Ser Leu Thr Pro Thr Val Ala Ala Gln His Leu Asn Ala

5 10 15 

Pro Leu Glu Ser Leu Arg Arg His Val Asp Leu Met Val Gly Gly 

20 25 30 

Ala Thr Leu 

(2) INFORMATION FOR SEQ ID NO: 249:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 249:

Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Pro Asn Ala

5 10 15 

Pro Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala 

20 25 30 

Ala Thr Met 

(2) INFORMATION FOR SEQ ID NO: 250: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:250: 

Trp Val Xaa Ile Thr Pro Thr Leu Ser Ala Pro Xaa Xaa Gly Ala

5 10 15 

Val Thr Ala Pro Leu Arg Arg Xaa Val Asp Tyr Leu Ala Gly Gly 

20 25 30 
Ala Ala Leu

(2) INFORMATION FOR SEQ ID NO: 251: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 251:

Trp His Ala Val Thr Pro Thr Leu Ala Ile Pro Asn Ala Ser Thr

5 10 15 

Pro Ala Thr Gly Phe Arg Arg His Val Asp Leu Leu Ala Gly Ala 

20 25 30 

Ala Val Val 



(2) INFORMATION FOR SEQ ID NO:252: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:252:

Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu

5 10 15 

Xaa Leu Xaa Val Val Phe Gly Gly 

20 

(2) INFORMATION FOR SEQ ID NO:253: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:253:

Thr Thr Thr Met Leu Leu Ala Tyr Leu Val Arg Ile Pro Glu Val

5 10 15

Ile Leu Asp Ile Val Thr Gly Gly

20 

(2) INFORMATION FOR SEQ ID NO: 254:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:254:

Thr Xaa Thr Xaa Ile Leu Ala Tyr Xaa Met Arg Val Pro Glu Val

5 10 15

Ile Xaa Asp Ile Xaa Xaa Gly Ala

20 

(2) INFORMATION FOR SEQ ID NO: 255:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 255:

Ala Val Gly Met Val Val Ala His Xaa Leu Arg Leu Pro Gln Thr

5 10 15

Xaa Phe Asp Ile Xaa Ala Gly Ala

20

(2) INFORMATION FOR SEQ ID NO: 256:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 256: 

Thr Xaa Ala Leu Val Xaa Ser Gln Leu Leu Arg Xaa Pro Gln Ala

5 10 15 

Xaa Xaa Asp Xaa Val Xaa Gly Ala 

20 

(2) INFORMATION FOR SEQ ID NO: 257: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:257: 

Thr Xaa Ala Leu Val Xaa Ala Gln Leu Leu Arg Xaa Pro Gln Ala

5 10 15

Xaa Leu Asp Met Ile Ala Gly Ala

20

(2) INFORMATION FOR SEQ ID NO: 258: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 258: 

Thr Thr Thr Leu Leu Leu Ala Gln Ile Met Arg Val Pro Thr Ala

5 10 15

Phe Leu Asp Met Val Ala Gly Gly

20 

(2) INFORMATION FOR SEQ ID NO: 259: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:259: 

Thr Thr Thr Leu Xaa Leu Ala Gln Val Met Arg Ile Pro Ser Thr

5 10 15

Leu Val Asp Leu Leu Xaa Gly Gly

20

(2) INFORMATION FOR SEQ ID NO: 260: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:260: 



Thr Ala Thr Leu Val Leu Ala Gln Leu Met Arg Ile Pro Gly Ala

5 10 15

Met Val Asp Leu Leu Ala Gly Gly 

20 

(2) INFORMATION FOR SEQ ID NO: 261: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 261: 

Thr Ser Ala Leu Ile Met Ala Gln Ile Leu Arg Ile Pro Ser Ile

5 10 15

Leu Gly Asp Leu Leu Thr Gly Gly 

20 



(2) INFORMATION FOR SEQ ID NO: 262:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 262: 

Xaa Thr Ala Leu Xaa Met Ala Gln Xaa Leu Arg Ile Pro Gln Val

5 10 15

Val Ile Asp Ile Ile Ala Gly Xaa

20 

(2) INFORMATION FOR SEQ ID NO:263: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:263:

Thr Thr Thr Leu Val Leu Ser Ser Ile Leu Arg Val Pro Glu Ile

5 10 15

Cys Ala Ser Val Ile Phe Gly Gly

20 



CLAIMS

1. A purified and isolated DNA having a 
sequence selected from the group consisting of SEQ ID NO:1
through SEQ ID NO: 51. 

2. A purified and isolated protein encoded by a 
gene whose sequence includes a sequence selected from the 
group consisting of SEQ ID NO: 52 through SEQ ID NO: 102.

3. A purified and isolated DNA having a
sequence selected from the group consisting of SEQ ID NO: 
103 through SEQ ID NO: 154. 

4. A purified and isolated protein encoded by a
gene sequence selected from the group consisting of SEQ ID 
NO: 155 through SEQ ID NO: 206. 

5. A purified and isolated protein having an 
amino acid sequence selected from the group consisting of 
SEQ ID NO: 52 through SEQ ID NO: 102 and SEQ ID NO: 155 
through SEQ ID NO: 206. 



6. A method for the recombinant DNA-directed
synthesis of a protein, said method comprising:

culturing a transformed or transfected host
organism containing a DNA sequence capable
of directing the host organism to produce
said protein under conditions such that the
protein is produced, said protein exhibiting
substantial homology to a protein comprising
the amino acid sequence selected from the
group consisting of SEQ ID NO: 52 through SEQ
ID NO: 102 or SEQ ID NO: 155 through SEQ ID
NO: 206.

7. The method of claim 6, wherein the host
organism is transfected with a recombinant eukaryotic
expression vector. 

8. The method of claim 7, wherein the host 
organism is a eukaryotic cell. 

9. A recombinant expression vector comprising a 
DNA sequence selected from the group consisting of SEQ ID 
NO:1 through SEQ ID NO: 51 and SEQ ID NO: 103 through SEQ ID
NO: 154. 

10. A host organism transformed or transfected 
with a recombinant expression vector according to claim 9. 

11. A method of detecting antibodies against 
HCV, said method comprising: 

(a) contacting a biological sample with at 
least one protein of claim 5 to form an 
immune complex with the antibodies; and 

(b) detecting the presence of the immune 
complex. 

12. The method of claim 11 wherein the 
biological sample is selected from the group consisting of 
serum, saliva or lymphocytes or other mononuclear cells. 

13. The method of claim 11, wherein the 
recombinant protein is bound to a solid support. 

14. The method of claim 11, wherein the immune 
complex is detected using a labeled antibody.

15. A hepatitis C virus kit comprising: at least 
one protein comprising an amino acid sequence selected from 
the group consisting of: SEQ ID NO: 52 through SEQ ID NO: 102
and SEQ ID NO:155 through SEQ ID NO:206.

16. A composition comprising at least one
recombinant protein of claim 5 and an excipient, diluent or
carrier. 

17. A composition comprising an expression 
vector capable of directing host organism synthesis of a
protein having an amino acid sequence selected from the
group consisting of SEQ ID NO: 52 through SEQ ID NO: 102
and SEQ ID NO: 155 through SEQ ID NO: 206.

18. A method of preventing hepatitis C
infection, comprising administering the composition of
claim 16 or 17 to a mammal in an effective amount to 
stimulate the production of protective antibody. 



19. A vaccine for immunizing a mammal against 
hepatitis C infection, comprising at least one protein 
according to claim 5 in a pharmacologically acceptable 
carrier. 

20. A vaccine for immunizing a mammal against
hepatitis C infection, said vaccine comprising an
expression vector capable of directing host organism
synthesis of a protein having an amino acid sequence
selected from the group consisting of SEQ ID NO: 52 - SEQ ID
NO: 102 and SEQ ID NO: 155 - SEQ ID NO: 206.

21. A method for detecting the presence of the 
hepatitis C virus via a reverse transcription-polymerase 
chain reaction, said method comprising amplifying an HCV
reverse transcription product by polymerase chain reaction
using universal primers.

22. The method of claim 21, wherein said 
universal primers are deduced from universally conserved
nucleotide domains found in SEQ ID NO: 1 through SEQ ID NO:
51, in SEQ ID NO: 103 through SEQ ID NO: 154, or in
consensus sequences shown in Figures 1A-H and 6A-K.

23. Substantially isolated and purified
universal primers, wherein said primers have nucleic acid
sequences derived from universally conserved nucleotide
domains found in SEQ ID NO:1 through SEQ ID NO: 51, in SEQ
ID NO: 103 through SEQ ID NO: 154 and in consensus sequences
shown in Figures 1A-H and 6A-K.

24. A diagnostic kit for use in detecting the
presence of hepatitis C virus in a biological sample, said
kit comprising at least two universal primers according to 
claim 22. 

25. A diagnostic kit for use in detecting the 
presence of hepatitis C virus in a biological sample, said
kit comprising at least one nucleic acid sequence selected 
from the group consisting of SEQ ID No: 1-51 or SEQ ID 
No:103-154. 

26. A method for determining the genotype of a 
hepatitis C virus, said method comprising:

amplifying reverse transcription
products of RNA via polymerase chain
reaction using genotype-specific
amplification primers deduced from
genotype-specific nucleotide domains
found in SEQ ID NO:1 through SEQ ID
NO: 51, in SEQ ID NO: 103 through SEQ ID
NO: 154, or in consensus sequences shown
in Figures 1A-H and 6A-K.

27. A method for determining the genotype of a 
hepatitis C virus, said method comprising: 

(a) amplifying RNA via reverse
transcription-polymerase chain reaction
to produce amplification products;

(b) contacting said products with at least
one sequence shown in SEQ ID NO:1
through SEQ ID NO: 51 and SEQ ID NO: 103
through SEQ ID NO: 154; and

(c) detecting complexes of said product
which bind to said nucleic acid
sequence.

28. A method for determining the genotype of a 
hepatitis C virus, said method comprising: 

(a) amplifying RNA via reverse 
transcription-polymerase chain reaction 
to produce amplification products; 

(b) contacting said products with at least 
one genotype-specific oligonucleotide; 
and 

(c) detecting complexes of said products 
which bind to said oligonucleotide(s).
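
Steps (b) and (c) of claim 28 can be pictured, very loosely, as checking which genotype-specific oligonucleotide finds a complementary stretch in the amplification product. The Python sketch below is an in-silico toy under stated assumptions: exact substring matching stands in for hybridization, and the probe sequences and genotype labels are invented for illustration and are not taken from the sequence listing.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def assign_genotype(amplicon: str, probes: dict[str, str]) -> list[str]:
    """List the genotypes whose probe matches either strand of the amplicon.

    Exact matching is a simplification: real hybridization tolerates
    mismatches depending on stringency, which this sketch ignores.
    """
    hits = []
    for genotype, probe in probes.items():
        if probe in amplicon or reverse_complement(probe) in amplicon:
            hits.append(genotype)
    return sorted(hits)


# Hypothetical probes; sequences and labels are made up for this example.
EXAMPLE_PROBES = {
    "genotype X": "GGTCACGGCT",
    "genotype Y": "TTGAGGGTAC",
}

if __name__ == "__main__":
    product_x = "AAAC" + "GGTCACGGCT" + "TTTG"
    print(assign_genotype(product_x, EXAMPLE_PROBES))
```

An empty result or more than one hit would, in this toy model, correspond to an untypable sample or a cross-reactive probe respectively.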

29. The method of claims 27 or 28, wherein said 
amplification of step (a) uses universal primers deduced 
from universally conserved nucleotide domains found in SEQ 
ID NO:1 through SEQ ID NO: 51, in SEQ ID NO: 103 through SEQ
ID NO: 154, or in consensus sequences shown in Figures 1A-H 
and 6A-K. 

30. The method of claim 28, wherein said 
genotype-specific oligonucleotide of step (b) is a nucleic 
acid sequence deduced from genotype-specific nucleotide 
domains found in SEQ ID NO:1 through SEQ ID NO: 51 and SEQ
ID NO: 103 through SEQ ID NO: 154, or in consensus sequences 
shown in Figures 1A-H and 6A-K. 

31. Substantially isolated and purified
genotype-specific oligonucleotides, wherein said
oligonucleotides have nucleic acid sequences deduced from
genotype-specific nucleotide domains found in SEQ ID NO:1
through SEQ ID NO: 51, in SEQ ID NO: 103 through SEQ ID
NO: 154, or in consensus sequences shown in Figures 1A-H and
6A-K.

32. Substantially purified and isolated 
genotype-specific peptides having amino acid sequences 
deduced from genotype-specific amino acid domains located
in SEQ ID NO:52 through SEQ ID NO:102, in SEQ ID NO:155
through SEQ ID NO: 206, or in consensus sequences shown in
Figures 2A-H and 7A-K. 

33. A method of detecting antibodies specific
for a single genotype of HCV, said method comprising:

(a) contacting a biological sample with at
least one peptide of claim 32 to form
an immune complex with the antibodies,
and

(b) detecting the presence of the immune
complex.

34. The method of claim 33, wherein the 
biological sample is selected from the group consisting of
serum, saliva or lymphocytes or other mononuclear cells.

35. The method of claim 33, wherein said peptide 
is bound to a solid support.

36. The method of claim 33, wherein the immune
complex is detected using a labelled antibody or antigen.

37. A kit for use in detecting antibodies 
specific for a single genotype of HCV, said kit comprising:
at least one peptide selected from the genotype-specific
peptides of claim 32.

38. Substantially purified and isolated 
universal peptides having amino acid sequences deduced from 
universally conserved amino acid domains found in SEQ ID
NO: 52 through SEQ ID NO: 102, in SEQ ID NO: 155 through SEQ
ID NO: 206, or in consensus sequences shown in Figures 2A-H
and 7A-K. 

39. A method of detecting antibodies against all
genotypes of HCV, said method comprising:

(a) contacting a biological sample with at
least one peptide of claim 38 to form
an immune complex with the antibodies,
and

(b) detecting the presence of the immune
complex.

40. The method of claim 39, wherein the 
biological sample is selected from the group consisting of
serum, saliva or lymphocytes or other mononuclear cells.

41. The method of claim 39, wherein said peptide 
is bound to a solid support. 

42. The method of claim 39, wherein the immune
complex is detected using a labelled antibody or antigen.

43. A composition comprising at least one 
peptide of claim 32 and an excipient, diluent or carrier. 

44. A composition comprising at least one
peptide of claim 38 and an excipient, diluent or carrier. 

45. A method of preventing hepatitis C
infection, comprising administering the composition of
claims 43 or 44 to a mammal in an effective amount to
stimulate production of a protective antibody. 

46. A vaccine for immunizing a mammal against 
hepatitis C infection, comprising at least one peptide 
according to claims 32 or 38 in a pharmaceutically
acceptable carrier. 



47. A composition comprising at least one 
expression vector capable of directing host organism
synthesis of a genotype-specific peptide having amino acid
sequence deduced from a genotype-specific amino acid domain
located in SEQ ID NO: 52 - SEQ ID NO: 102, and SEQ ID NO: 155
- SEQ ID NO: 206, or in consensus sequences shown in figures
2A-H and 7A-K.

48. A composition comprising at least one 
expression vector capable of directing host organism 
synthesis of a universal peptide having amino acid sequence 
deduced from universally conserved amino acid domains found
in SEQ ID NO: 52 - SEQ ID NO: 102, and SEQ ID NO: 155 - SEQ ID
NO: 206, or in consensus sequences shown in figures 2A-H and 
7A-K. 



49. A method of preventing hepatitis C
infection, comprising administering the composition of
claims 47 or 48 to a mammal in an effective amount to 
stimulate production of a protective antibody. 

50. A vaccine for immunizing a mammal against
hepatitis C infection, said vaccine comprising at least one
expression vector capable of directing host organism
synthesis of a genotype-specific peptide having amino acid
sequence deduced from a genotype-specific amino acid
domain located in SEQ ID NO: 52 - SEQ ID NO: 102, and SEQ ID
NO: 155 - SEQ ID NO: 206, or in consensus sequences shown in
figures 2A-H and 7A-K.

51. A vaccine for immunizing a mammal against 
hepatitis C infection, comprising at least one expression 
vector capable of directing host organism synthesis of a
universal peptide having amino acid sequence deduced from
universally conserved amino acid domain found in SEQ ID
NO: 52 - SEQ ID NO: 102, and SEQ ID NO: 155 - SEQ ID NO: 206,
or in consensus sequences shown in figures 2A-H and 7A-K. 

52. Anti-HCV core antibodies having specific
binding affinity for core protein of a single genotype of
HCV.

53. Anti-HCV envelope 1 antibodies having
specific binding affinity for envelope 1 protein of a
single genotype of HCV.

54. The antibodies of claims 52 or 53 wherein 
said antibodies are monoclonal antibodies. 

55. A method of detecting core protein specific 
for a single genotype of HCV, said method comprising: 

(a) contacting a biological sample with at
least one antibody of claim 52 to form
an immune complex with said core
protein, and

(b) detecting the presence of the immune 
complex. 

56. A method of detecting E1 protein specific
for a single genotype of HCV, said method comprising:

(a) contacting a biological sample with at
least one antibody of claim 53 to form
an immune complex with said E1 protein
and

(b) detecting the presence of the immune
complex.

57. The methods of claims 55 or 56, wherein the 
biological sample is selected from the group consisting of
serum, saliva, lymphocytes or other mononuclear cells and
liver.

58. The method of claims 55 or 56, wherein said 
antibody is bound to a solid support. 

59. A method of detecting antibodies against all 
genotypes of HCV, said method comprising:

(a) contacting a biological sample with at
least one universal peptide of claim 38
to form an immune complex with said
antibodies; and

(b) detecting the presence of the immune 
complex. 

claims:

1. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 1-8 and 52-59 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype I/1a.

2. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 103-108 and
155-160 used in the recombinant protein expression, detection
of antibodies against HCV, vaccines and methods of detection
using PCR primers and Identification of Genotype I/1a.

3. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 9-25 and 60-76 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype II/1b.

4. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 109-124 and 161-176
used in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype II/1b.

5. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 26-29 and 77-80 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype III/2a.

6. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 125-128 and 177-180
used in the recombinant protein expression, detection of
antibodies against HCV, vaccines and methods of detection using PCR
primers and Identification of Genotype III/2a.

7. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 30-33 and 81-84 used in
the recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype IV/2b.

8. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 129-133 and 181-185
used in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype IV/2b.

9. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 34 and 85 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype IV/2c.

10. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 134 and 186 used in
the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype IV/2c.

11. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 35-39 and 86-90 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype V/3a.

12. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 135-138 and 187-190
used in the recombinant protein expression, detection of
antibodies against HCV, vaccines and methods of detection using PCR
primers and Identification of Genotype V/3a.

13. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 40 and 91 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4a.

14. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 139 and 191 used in
the recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4a.

15. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 41 and 92 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4b.

16. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 141 and 193 used in
the recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4b.

17. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 42-43 and 93-94 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype 4c.

18. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 143-144 and 195-196
used in the recombinant protein expression, detection of
antibodies against HCV, vaccines and methods of detection using PCR
primers and Identification of Genotype 4c.

19. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 44 and 95 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4d.

20. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 145 and 197 used in
the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype 4d.

21. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 142 and 194 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 4e.

22. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 140 and 192 used in
the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype 4f.

23. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 45-50 and 96-101 used
in the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype 5a.

24. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 146-153 and 198-205
used in the recombinant protein expression, detection of
antibodies against HCV, vaccines and methods of detection using PCR
primers and Identification of Genotype 5a.

25. 1,2,5-51,53,54,56 to 59 (partially):

Genotype specific peptides from E1 Seq. ID 51 and 102 used in the
recombinant protein expression, detection of antibodies against
HCV, vaccines and methods of detection using PCR primers and
Identification of Genotype 6a.

26. 3-52,55 and 59 (partially):

Genotype specific peptides from Core Seq. ID 154 and 206 used in
the recombinant protein expression, detection of antibodies
against HCV, vaccines and methods of detection using PCR primers
and Identification of Genotype 6a.